GenBank
(Summary)
Abstract
The GenBank database is designed to provide and encourage access within the
scientific community to the most up-to-date and comprehensive DNA sequence
information. It is a public database that contains 9.9 trillion base
pairs from over 2.1 billion nucleotide sequences for 478 000 formally described
species.
Integrated with NCBI's Entrez, it offers diverse biological information. BLAST
provides sequence similarity searches. Regular updates and complete releases are
available via FTP.
Introduction
GenBank is managed by the National Center for Biotechnology Information (NCBI), a
division of the National Library of Medicine (NLM), located on the campus of the US
National Institutes of Health (NIH) in Bethesda, MD, USA. After summarizing the
growth of GenBank in the past year, this paper will briefly review recent updates and
developments.
Methods
GenBank Records and Divisions:
GenBank groups sequence records into divisions based either on the source
taxonomy or the sequencing strategy used to obtain the data. The taxonomic
divisions include viruses (VRL), bacteria (BCT), primates (PRI) and rodents (ROD).
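As a rough illustration of how a division code can be read from a record, the sketch below pulls it out of a flat-file LOCUS line. It assumes the usual layout in which the division code is the second-to-last whitespace-separated field, just before the date. The function name, the lookup table, and the example record are hypothetical and only for illustration; the three-letter code GenBank uses for primates is PRI.

```python
# Taxonomic division codes mentioned in the text (GenBank uses PRI for primates).
DIVISIONS = {
    "VRL": "viral",
    "BCT": "bacterial",
    "PRI": "primate",
    "ROD": "rodent",
}


def division_of(locus_line: str) -> str:
    """Return the division code from a GenBank flat-file LOCUS line.

    Assumption: the division code is the second-to-last
    whitespace-separated field, immediately before the date.
    """
    fields = locus_line.split()
    if len(fields) < 3 or fields[0] != "LOCUS":
        raise ValueError("not a LOCUS line")
    return fields[-2]


# Made-up LOCUS line, for illustration only.
example = "LOCUS       HYPO0001     1200 bp    mRNA    linear   ROD 01-JAN-2000"
code = division_of(example)
label = DIVISIONS.get(code, "other")
```

Here `code` holds the raw three-letter code and `label` its description; codes missing from the small table fall back to "other" rather than raising an error.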
High-throughput genomic (HTG) and cDNA (HTC) sequences are integral to the
database, with HTG sequences transitioning to a finished state. The accession
number system, versioning, and unique identifiers like 'gi' facilitate tracking
sequence changes.
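The way versioning supports change tracking can be sketched in a few lines. The example assumes the common "accession.version" form, in which the part before the dot stays fixed for a record and the integer after it increases whenever the sequence changes. The identifiers below are made up, and the helper names are hypothetical.

```python
import re
from typing import NamedTuple


class SeqId(NamedTuple):
    accession: str  # stable identifier for the record
    version: int    # incremented when the sequence itself changes


# Assumed form: letters/digits/underscore, a dot, then an integer version.
_ID_PATTERN = re.compile(r"^([A-Za-z0-9_]+)\.(\d+)$")


def parse_seq_id(text: str) -> SeqId:
    """Split an 'accession.version' string into its two parts."""
    match = _ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not an accession.version identifier: {text!r}")
    return SeqId(match.group(1), int(match.group(2)))


def sequence_changed(old: str, new: str) -> bool:
    """True if both ids name the same record and 'new' has a later version."""
    a, b = parse_seq_id(old), parse_seq_id(new)
    return a.accession == b.accession and b.version > a.version


# Made-up identifiers, for illustration only.
first = parse_seq_id("AB000001.1")
changed = sequence_changed("AB000001.1", "AB000001.2")
```

Comparing versions only within the same accession keeps unrelated records from being mistaken for revisions of one another.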
GenBank releases, available in flat-file and ASN.1 formats, can be obtained via
FTP from NCBI and mirror sites. The database's continuous growth and
improvements make it a vital resource for the global scientific community.
Conclusions
GenBank data show that Zea mays and Oryza sativa are the best-studied plant
species, with 3.6 billion and 1.5 billion bases of sequence in the database,
respectively (Benson et al., 2008). The situation is completely different for the
genus Olea. In fact, only a few sequences have been submitted in the last few years
and only 1037 core nucleotide, 24 EST (expressed sequence tag), and two GSS
(genome survey sequence) sequences were actually recovered from Entrez, the
NCBI’s retrieval system, which integrates the main DNA sequence databases (Figure
2.5 summarizes this information).
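A record count of this kind can be requested from Entrez through the E-utilities ESearch endpoint. The sketch below only builds the request URL and does not contact NCBI. The database name `nuccore` and the search term `Olea[Organism]` are assumptions chosen for illustration, not the query used in the source, and a count returned today would differ from the figures quoted above.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def count_query_url(db: str, term: str) -> str:
    """Build an ESearch URL that asks only for the number of matching records."""
    params = {"db": db, "term": term, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


# Assumed database and search term, for illustration only.
url = count_query_url("nuccore", "Olea[Organism]")
```

Fetching `url` (for example with `urllib.request.urlopen`) should return a small XML document containing the count. The request is left out here so the example runs without network access.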