Intro Client Update Latest

The National Center for Biotechnology Information (NCBI) provides a comprehensive suite of online resources for biological data, including GenBank and PubMed, along with recent updates and developments in various databases. Key features include the introduction of PubMed Labs for experimental projects, enhancements to the dbGaP Data Browser, and updates to PubMed Central for improved user experience. The document outlines the ongoing collaboration with health agencies for pathogen detection and the expansion of genetic testing resources, emphasizing NCBI's commitment to facilitating access to biological information.


Published online 1 March 2025

Nucleic Acids Research, 2024, Vol. 44, Database issue D7–D19

doi: 10.1093/nar/gkv1290

Database resources of the National Center for Biotechnology Information
NCBI Resource Coordinators† Mr. Shenoy, Mr. Jayesh Nikalje, Mrs. Linda Gas, Mrs. Nadia Rasquinha

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA

Received September 16, 2023; Revised November 4, 2024; Accepted March 1, 2025

ABSTRACT

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

INTRODUCTION

The National Center for Biotechnology Information (NCBI) at the National Institutes of Health was created in 1988 to develop information systems for molecular biology. In addition to maintaining the GenBank® (1) nucleic acid sequence database, which receives data through an international collaboration with the DNA Data Bank of Japan (DDBJ) and the European Nucleotide Archive (ENA) as well as from the scientific community, NCBI provides many other kinds of biological data as well as retrieval systems and computational resources for the analysis of GenBank and other data. This article provides a summary of recent developments, including both new and updated resources, followed by an introduction to the Entrez system and a brief review of the suite of NCBI resources. All resources discussed are available through the NCBI home page at www.ncbi.nlm.nih.gov and can also be located using the NCBI Web Site database available in Entrez search menus. In most cases, the data underlying these resources and executables for the software described are available for download at ftp.ncbi.nlm.nih.gov.

RECENT DEVELOPMENTS

PubMed labs

PubMed Labs is an NCBI initiative for developing experimental projects by involving the user community throughout the process. Projects in PubMed Labs may be early versions of new services, proposed features for existing resources, or novel database content. PubMed Labs projects will be described on the NCBI Insights blog (ncbiinsights.ncbi.nlm.nih.gov) in a new category of posts labeled PubMed Labs. Each post will describe the project and provide instructions for how users can test its functions, and will also indicate what results to expect. We encourage users to share their experiences with us by commenting on these posts. Currently there are two initial projects in PubMed Labs: SmartBLAST and PubMed Also-Viewed.

SmartBLAST is an experimental tool that makes it easier to accomplish common protein sequence analysis tasks such as finding a candidate name for a protein, identifying highly conserved regions and locating segments that are present in closely related database sequences but that are missing from the query. SmartBLAST does this by performing two parallel BLASTp searches: one that retrieves the closest matching sequences available, and another that finds

To whom correspondence should be addressed. Eric W. Sayers. Tel: +1 301 496 2475; Fax: +1 301 480 9241; Email: [email protected]
†The members of the NCBI Resource Coordinators group are listed in the Appendix.

Published by Oxford University Press on behalf of Nucleic Acids Research 2015.

This work is written by (a) US Government employee(s) and is in the public domain in the US.

the closest matches from well-annotated sequences from model organisms. The tool then constructs a multiple sequence alignment between the query and five of the closest matches, and displays this as a phylogenetic tree. SmartBLAST accomplishes all of this in much less time than it takes to run a typical BLASTp search. Links to SmartBLAST may be found on the main BLAST page as well as on BLASTp results pages.

PubMed Also-Viewed is a link on some PubMed abstract pages that shows PubMed records that other users have viewed with the current record. When present, the link is labeled Articles frequently viewed together and appears on the right side of the abstract page. At the time of this writing the link was available for about 5% of PubMed records.

Revised NCBI home page

In 2014 NCBI revised the main home page to include six prominent buttons that lead to pages focused on a particular set of services. The Submit button loads a page that helps submitters choose the correct mechanism(s) for submitting data to NCBI, while the Download page provides access to the FTP site and related tools. The Learn page introduces users to a variety of NCBI educational resources and programs, while the Analyze page leads to software tools developed by NCBI. Intended for software developers, the Develop page links to the several NCBI APIs and software toolkits, as well as to the NCBI GitHub repository. Finally, the Research page allows users to explore NCBI research projects and collaborations, along with NCBI research groups and associated staff. Most of these pages also feature a Feedback button so that users can easily provide comments and suggestions.

dbGaP data browser

A new addition to the Database of Genotypes and Phenotypes (dbGaP) is the dbGaP Data Browser (DDB; https://www.ncbi.nlm.nih.gov/gap/ddb/) that enables users to explore variant calls, genotype calls and supporting sequence read alignments for controlled access datasets in a genomic context. It is a companion to the dbGaP Beacon resource and provides a graphical interface for reviewing its results. The interface is modeled after the 1000 Genomes project browser; however, its content represents a combination of data from the dbGaP controlled access and public databases. Depending upon authentication credentials, users can access read-only views of selected datasets or download datasets to which they have been granted access. Within the browser, the Subjects widget supports faceted and free-text searching for browser tracks of Sequence Read Archive (SRA) alignments for samples of interest. Sample attributes such as the study accession, BioSample ID, sex, population, tumor/normal status and the availability of sample genotype data are just a few of the available search facets. The browser’s graphical display allows users to view the read alignments from selected study participants alongside data from ClinVar, the Database of Single Nucleotide Polymorphisms (dbSNP) and Gene. The Genotypes table in the browser provides access to individual-level and aggregate General Research Use (GRU) dataset genotypes for studies having samples whose SRA alignments are displayed in the graphical view. Within the table, users can view frequencies or counts for either alleles or samples. DDB also includes functions found in other NCBI browsers, including feature search and support for user data uploads. Further documentation for DDB is available online (https://www.ncbi.nlm.nih.gov/gap/ddb/help/).

Pathogen detection

The NCBI Pathogen Detection project (www.ncbi.nlm.nih.gov/pathogens/) is a new system that facilitates real-time surveillance of bacterial pathogens and foodborne disease. This project is a collaboration with several public health agencies including the Centers for Disease Control (CDC), the Food and Drug Administration (FDA), the United States Department of Agriculture Food Safety and Inspection Service (USDA-FSIS), Public Health England (PHE) and several state and regional labs in the US. Samples collected by these agencies are sequenced and submitted to NCBI where an automated pipeline clusters and identifies the sequences and then quickly reports the results back. Collecting and analyzing pathogen sequence data obtained from food, the environment and human patients reveals potential sources of contamination and facilitates traceback investigations and responses to outbreaks. Currently the project focuses on the following bacterial groups: Campylobacter, Listeria, Salmonella and the combination of Escherichia coli and Shigella. Analysis results are available from the NCBI FTP site (ftp.ncbi.nlm.nih.gov/pathogen/).

PubMed updates

In response to user feedback, minor changes were made to the PubMed interface so that popular features are easier to find. The Related citations feature was renamed to Similar articles (the algorithm to generate the results was not modified). The Save search and RSS links that allow users to create My NCBI automatic email alerts were renamed to Create alert and Create RSS. Additionally, to provide a less-cluttered PubMed results display, the status tag lines (e.g. [PubMed - as supplied by publisher]) were removed from the summary results. The abstract display was also enhanced to show the transliterated title for citations originally published in a non-English language that do not include an English title.

PubMed Mobile was updated with a number of styling modifications and enhancements including a Trending articles feature on the home page. PubMed Mobile summary results now include the Related searches discovery tool as well as sorting options and additional filter selections. These Discovery tools appear below the results on mobile devices with smaller screen sizes. A Show full citation link was added to streamline the PubMed Mobile abstract page, and clicking this link displays the citation, author(s) and affiliation details. Clicking a linked author name in the resulting list displays a sorted set of citations for that author.

Updates to PubMed Central (PMC)

In response to requests from authors and users, PubMed Central (PMC) added several new features in the last year.

Among these features is the new citation exporter that makes it easy to retrieve pre-formatted citations in AMA, MLA or APA styles that users can then copy, paste or download into bibliographic reference manager software. Additionally, in order to facilitate text and data mining for articles in the Open Access Subset, PMC is now providing plain text files for these articles on the PMC FTP site. The files contain the full text of the article, extracted either from the XML or PDF source files.

PMC is also now serving as the public access repository for a number of federal agencies in addition to NIH that support scientific research. As of January 2015, the Centers for Disease Control and Prevention (CDC), the National Institute of Standards and Technology (NIST) and the Department of Veterans Affairs (VA) have implemented public access policies requiring researchers who are supported by these agencies to make the resulting manuscripts publicly available in PMC within 12 months of publication. The NIH manuscript submission (NIHMS) system has been extended to support researchers from these additional agencies.

SciENcv updates

My NCBI (www.ncbi.nlm.nih.gov/account/) provides a specialized biosketch tool called SciENcv (Science Experts Network Curriculum Vitae) for users who wish to create, download and share biosketches for NIH grant applications. SciENcv was updated to support the new NIH Biographical Sketch format that became mandatory for grant applications with due dates of May 2015 or later. SciENcv can upgrade biosketches stored in the previous NIH format to the new version and it also supports the NSF biosketch format on an alpha-release basis. SciENcv is integrated with the My Bibliography citation tool, which is required for NIH extramural grantees to demonstrate compliance with the NIH Public Access Policy. Among other improvements to the user interface, the process used to describe scientific accomplishments has been enhanced to allow easier import of citations.

Updates to medical genetics resources

GTR. The Genetic Testing Registry (GTR) collects and displays information that has been submitted by providers about their genetic tests (2). GTR accepts submissions for germline, somatic and research tests. Submission formats have been expanded from interactive wizards and Excel templates to include submission of tests as XML files. Submitted data appear on the GTR web site within 24–48 h (http://www.ncbi.nlm.nih.gov/gtr/docs/fulltest/). GTR submitters are required to review their laboratory and test data annually (www.ncbi.nlm.nih.gov/gtr/docs/annual_review/). To support automated test updates, submitters can download all of their submitted test data to an Excel template file to edit and upload. Records that have not been reviewed within a year of the previous review are marked out of date on the GTR site, and records that have not been reviewed in two years will be removed from display. This stringency in representing current information may result in differences in test counts between GTR and other public websites.

In addition to providing an advanced search to find tests by a variety of attributes, GTR now provides an All GTR display that not only allows users to interrogate content about tests, conditions, genes and laboratories simultaneously, but also organizes the data in each domain to show high-value information. For example, the Tests tab shows the name of the test, the name and location of the lab and the methods used, along with links to view details about conditions and test targets. The Conditions tab lists symptoms to support recognition and links to all tests, genes and GeneReviews for each condition. The Genes tab displays associated conditions for each gene and provides links to tests for each gene. Each of these tabs also allows users to select multiple items (e.g. genes, conditions, laboratories) and retrieve associated tests.

Genome updates

NCBI has continued to revise the genome area of the NCBI FTP site (ftp.ncbi.nlm.nih.gov/genomes/). The new directories within the genome area (genbank, refseq and all) now provide a standard set of data files for over 54 000 assemblies. The genbank directory contains data submitted directly to GenBank (or an INSDC database), while the refseq directory contains data that are part of the Reference Sequence (RefSeq) project. The common data unit within these three directories is a subdirectory corresponding to a record in the Assembly database (3) and given a name consisting of the Assembly accession followed by the name of the assembly. For example, the human GRCh38 release has assembly accession GCF_000001405.26, and so the subdirectory is named GCF_000001405.26_GRCh38. Users are encouraged to search the Assembly database directly to find these accessions, assembly names and other details about the datasets. The genbank and refseq directories collect the Assembly subdirectories within broad taxonomic directories (e.g. plant, bacteria and vertebrate_mammalian) and also directories for each species. Each Assembly subdirectory contains a standard set of files including FASTA and GenBank/GenPept data for genome, transcript and protein sequences, along with GFF3 and feature table files for annotated genome records. Older directories in the genome FTP area that are no longer being updated will be moved to an archive area by late 2015.

Gene updates

In response to the rapidly growing number of prokaryotic genomes being submitted to NCBI, the scope of the Gene database changed in 2014. Going forward, NCBI will create Gene records only for reference and selected representative genomes for a single prokaryotic species (www.ncbi.nlm.nih.gov/refseq/about/prokaryotes/). Gene records for strains not included in these sets are being discontinued, and their record pages will contain messages providing more detail about each case (www.ncbi.nlm.nih.gov/refseq/about/prokaryotes/faq/).

Protein updates

Protein database records now have a prominent link to the Identical Protein Report at the top of the record page. This

report displays the accessions of all other protein records whose sequences are identical to that of a given protein. The report also provides links to the CDS sequence in Nucleotide for each protein. While this report is available for all proteins, it is especially useful for the non-redundant RefSeq WP sequences introduced in 2013 (4). Because many WP sequences represent a set of identical proteins that may not have separate species- or strain-specific records, these WP sequences are connected to not one but a corresponding set of Nucleotide CDS sequences. The Identical Protein Report clarifies these relationships. For example, WP_002317106 collects over 40 thioredoxin sequences from several species and strains of Enterococcus. The report is also available through the E-utility EFetch with &rettype=ipg.

BLAST updates

A new search box on the BLAST home page makes it easy to find a genomic BLAST search page for a given organism. The box has an autocomplete feature that presents suggestions when a user starts to type. Choosing an organism loads a BLAST search page with the best genomic database for that organism preselected. The search box also produces metagenomic and microbial suggestions.

MOLE-BLAST is a new tool that classifies multiple nucleotide query sequences and displays their relationships. Ideal input for this tool is a set of sequences representing a specific locus from a group of organisms rather than the entire genome of an organism or a set of unannotated contigs. Example input would be 16S sequences from different bacteria or ITS sequences from fungi. MOLE-BLAST first assigns each query in the input set to a cluster using BLAST, thereby grouping the queries by locus. Second, it performs a database search to find top matches for each query. Third, it computes a multiple alignment (using MUSCLE) between the queries and their top matches, and presents this analysis as a phylogenetic tree.

PubChem updates

Both the PubChem Compound and Substance record view pages were completely redesigned in the past year. The new pages use a responsive design approach that optimizes the display on a variety of screen sizes, including both touch- and mouse-based interfaces. These reports include an improved and expanded table of contents that makes navigation easier and users can now bookmark particular sections of the reports. Full details of the many improvements are discussed in posts on the PubChem blog (pubchemblog.ncbi.nlm.nih.gov).

THE ENTREZ SYSTEM

Entrez databases

Entrez (5) is an integrated database retrieval system that provides access to a diverse set of 39 databases that together contain 1.7 billion records (Table 1). Links to the web portal for each of these databases are provided on the Entrez GQuery page (www.ncbi.nlm.nih.gov/gquery/). Entrez supports text searching using simple Boolean queries, downloading of data in various formats and linking of records between databases based on asserted relationships. In their simplest form, these links may be cross-references between a sequence and the abstract of the paper in which it is reported, or between a protein sequence and either its coding DNA sequence or its 3D-structure. Computationally derived links between neighboring records, such as those based on computed similarities among sequences or among PubMed abstracts, allow rapid access to groups of related records. Several popular links are displayed as Discovery Components in the right column of Entrez search result or record view pages, making these connections easier to find and explore. The LinkOut service expands the range of links to include external resources, such as organism-specific genome databases. The records retrieved by Entrez can be displayed in many formats and downloaded singly or in batches.

Data sources and collaborations

NCBI receives data from three sources: direct submissions from external investigators, national and international collaborations or agreements with data providers and research consortia, and internal curation efforts. The Data Source column in Table 1 indicates those mechanisms by which each Entrez database receives data. More information about the various collaborations, agreements and curation efforts is available through the home pages of the individual resources.

Entrez programming utilities (E-Utilities)

The Entrez Programming Utilities (E-Utilities) constitute the Application Programming Interface (API) for the Entrez system. The API includes nine programs that support a uniform set of parameters used to search, link and download data from the Entrez databases. EInfo provides basic statistics on a given database, including the last update date, along with lists of all search fields and available links. ESearch returns the identifiers of records that match an Entrez text query and when combined with EFetch or ESummary, provides a mechanism for downloading the corresponding data records. ELink gives users access to the vast array of links within Entrez so that data related to an input set can be retrieved. By assembling URL calls to the E-utilities within simple scripts, users can create powerful applications to automate Entrez functions to accomplish batch tasks that are impractical using web browsers. Detailed documentation for using the E-Utilities is available at eutils.ncbi.nlm.nih.gov.

Entrez Direct is a set of executables that provides an interface to the E-utilities on the UNIX command line. These executables are designed so that the output of one can be passed directly as input to another using the UNIX pipe (‘|’). In this way, it is straightforward to implement a diverse assortment of workflows. Entrez Direct also offers a utility named xtract that parses the XML output of ESummary and EFetch calls so that individual fields within records can be retrieved and formatted into custom tables, especially when combined with standard UNIX commands such as grep, sort, cut, awk or sed. Complex workflows can be conveniently saved as shell scripts for sharing or use

Table 1. The Entrez Databases (as of 1 September 2015)

Database                Records   Section within this article   Data source¹

Site Search              21 929   Introduction                  N
PubMed               25 235 441   Literature                    C
PubMed Central        3 633 245   Literature                    D, C
NLM Catalog           1 530 854   Literature                    C, N
MeSH                    259 099   Literature                    N
Books                   446 888   Literature                    C, N
MedGen*                 272 979   Health                        C, N
dbGaP                   207 859   Health                        D
ClinVar*                124 971   Health                        D, N
PubMed Health            55 244   Health                        C
GTR*                     31 991   Health                        D
SNP*                705 483 355   Genomes                       D (dbSNP), N
Nucleotide*         199 827 994   Genomes                       D (GenBank), C, N
GSS*                 39 394 513   Genomes                       D (GenBank)
Clone*               37 336 118   Genomes                       D, N
Probe                32 379 570   Genomes                       D
dbVar*                4 481 341   Genomes                       D
BioSample             3 648 667   Genomes                       D
SRA*                  1 697 236   Genomes                       D
Taxonomy*             1 426 896   Genomes                       C, N
BioProject*             152 290   Genomes                       D
Assembly*                59 566   Genomes                       C, N
Genome*                  13 532   Genomes                       C, N
Epigenomics*               7789   Genomes                       D
GEO Profiles*       108 708 851   Genes                         D
EST*                 75 992 479   Genes                         D (GenBank)
Gene*                21 399 200   Genes                         C, N
UniGene*              6 473 284   Genes                         N
GEO Datasets*         1 645 202   Genes                         D
PopSet*                 231 877   Genes                         D (GenBank)
Homologene*             141 268   Genes                         N
Protein*            223 456 488   Proteins                      C, N
Protein Clusters*       820 546   Proteins                      N
Structure*              111 186   Proteins                      C, N
CDD*                     50 648   Proteins                      C, N
PubChem Substance*  157 362 091   Chemicals                     D
PubChem Compound*    60 774 418   Chemicals                     N
PubChem Bioassay*     1 154 363   Chemicals                     D
Biosystems*             805 473   Chemicals                     C

¹ D = direct submission; C = collaboration/agreement; N = internal NCBI/NLM curation.
* Indicates that the data in this resource are available by FTP.
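
The E-utilities described under Entrez programming utilities reach the databases of Table 1 through URL calls. As a rough sketch only, the Python fragment below assembles (but does not send) an ESearch URL and an EFetch URL that requests the Identical Protein Report with rettype=ipg. The full base URL path, the helper names (eutils_url, esearch_url, efetch_url), the db value protein, the retmax and retmode parameters and the field tags in the sample query are assumptions made for illustration and do not come from this article; the E-utilities documentation remains the authority on valid parameters.

```python
from urllib.parse import urlencode

# Assumed base path; the article itself only cites eutils.ncbi.nlm.nih.gov.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def eutils_url(program, **params):
    """Build the URL for one E-utility call (hypothetical helper).

    `program` is the lowercase program name, e.g. "esearch" or "efetch".
    Parameters are sorted so that the resulting URL is deterministic.
    """
    query = urlencode(sorted(params.items()))
    return f"{EUTILS_BASE}/{program}.fcgi?{query}"


def esearch_url(db, term, retmax=20):
    """URL for an ESearch call returning identifiers that match a text query."""
    return eutils_url("esearch", db=db, term=term, retmax=retmax)


def efetch_url(db, ids, rettype, retmode="text"):
    """URL for an EFetch call retrieving records for a list of identifiers."""
    return eutils_url("efetch", db=db, id=",".join(ids),
                      rettype=rettype, retmode=retmode)


if __name__ == "__main__":
    # Illustrative query only; the field tags are assumptions.
    print(esearch_url("protein", "thioredoxin[Title] AND Enterococcus[Organism]"))
    # Identical Protein Report for the WP accession mentioned in the text.
    print(efetch_url("protein", ["WP_002317106"], rettype="ipg"))
```

The sketch keeps URL construction separate from network access, so a script can log or inspect each call before passing it to whatever HTTP client it uses. Chaining the two steps, with identifiers returned by ESearch fed into EFetch, mirrors the ESearch plus EFetch combination described in the text.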

by other applications. Extensive documentation is available (www.ncbi.nlm.nih.gov/books/NBK179288/) that includes full descriptions of the many options and numerous examples spanning a wide variety of NCBI resources.

LITERATURE

PubMed

The PubMed database contains citations from life science journals, many of which include abstracts and links to their full text articles.

PubMed commons

PubMed Commons enables the community to share information and opinions on scientific publications. Any author of a publication indexed in PubMed is eligible to join PubMed Commons, and members may comment on any publication in PubMed. Comments appear below the publication’s abstract, and are regularly monitored for adherence to guidelines (www.ncbi.nlm.nih.gov/pubmedcommons/help/guidelines/). Comments are citable and may be shared or adapted, with attribution, under a Creative Commons license (creativecommons.org/licenses/by/3.0/us/).

PubMed Central (PMC)

PMC (6) contains the full text of peer-reviewed journal articles in the life sciences, and is the repository for all manuscripts arising from NIH and other federal research funds (e.g. CDC, VA, NIST) that are submitted through the NIH manuscript submission system (NIHMS). Journals that have PMC-participation agreements provide free access to full-text articles in PMC either immediately after publication or after a set embargo period. Manuscripts that fall under the public access policies of participating funders must be made available in PMC within 12 months of publication. PMC articles are available as either HTML or PDF documents, or can be read using the PubReader viewer.

NLM catalog

The NLM Catalog contains bibliographic data for the various items in the NLM collections, including jour-

nals, books, audiovisuals, computer software, electronic re- organizes terms from multiple sources by assigning them
sources and other materials. a concept ID, and then adds value by reporting practice
guidelines, related genes from the Gene database, variants
Medical subject headings (MeSH) in ClinVar and available tests in GTR. MedGen supports
querying for disorders that share clinical features as well
The MeSH database (7) includes information about the as drugs and their responses. MedGen data can be down-
NLM controlled vocabulary thesaurus used for indexing loaded using FTP as pipe-delimited (RFF) or CSV text files
PubMed citations, and provides an interface for construct- and using the E-utilities.
ing PubMed queries using MeSH terms.
dbGaP
NCBI bookshelf
The Database of Genotypes and Phenotypes (dbGaP) (10)
The NCBI Bookshelf is an online service of the National archives, distributes and supports submission of data that
Library of Medicine Literature Archive (NLM LitArch) correlate genomic characteristics with observable traits.
that provides free access to the full text of books, reports, This database is a designated NIH repository for NIH-
databases and documentation in the life sciences and health funded genome-wide association study (GWAS) results
care fields. (grants.nih.gov/grants/gwas/). To protect the confidential-
ity of study subjects, dbGaP accepts only de-identified data
HEALTH and requires investigators to go through an authorization
process in order to access individual-level data. Study doc-
ClinVar uments, protocols and subject questionnaires are available
ClinVar supports users who want to determine what has without restriction.
been reported about the medical relevance of human se-
quence variation (8). ClinVar provides two major displays: PubMed health
a Record Report that aggregates submitted interpretations
of a variation and a condition and a Variation Report PubMed Health provides information for consumers and
that organizes information about each variant. In both clinicians about the prevention and treatment of diseases
views, ClinVar aggregates data from multiple submit- and conditions. The database specializes in reviews of clin-
ters to make it easier to evaluate the current status of interpretation. ClinVar records maintain connections with dbSNP, dbVar, Gene, MedGen, and PubMed using Entrez links, and are accessible as annotations on chromosome and RefSeqGene sequences. They are also included in the Variation Viewer tool. ClinVar continues to add functions to facilitate retrieval, such as a query (http://www.ncbi.nlm.nih.gov/clinvar?term=%22gene%20acmg%20incidental%202013%22[Properties]) to retrieve all records of variants in genes for which investigators should report incidental findings as recommended by the American College of Medical Genetics and Genomics (9). As a partner of the ClinGen project, ClinVar encourages domain experts to apply for recognition as an expert panel (www.ncbi.nlm.nih.gov/clinvar/docs/expert_panel/) and submit their interpretations of human variants.

ClinVar offers several options for submission, from simple spreadsheets to comprehensive XML files (www.ncbi.nlm.nih.gov/clinvar/docs/submit). ClinVar data are freely available for download from the website, by FTP as VCF or XML files or using the E-utilities.

ical effectiveness research, containing both summaries for consumers and full technical reports.

dbMHC, dbLRC, dbRBC
NCBI maintains three databases for routine clinical applications: dbMHC, dbLRC and dbRBC. dbMHC focuses on the Major Histocompatibility Complex (MHC) and contains sequences and frequency distributions for MHC alleles, as well as genotype and clinical outcome information on hematopoietic cell transplants performed worldwide. dbLRC offers a comprehensive collection of alleles of the Leukocyte Receptor Complex (LRC) with an emphasis on KIR genes. dbRBC provides data on genes for Red Blood Cell (RBC) antigens along with access to the International Society of Blood Transfusion allele nomenclature of blood group alleles. dbRBC also hosts the Blood Group Antigen Gene Mutation Database (11) and integrates it with resources at NCBI. All three databases provide multiple sequence alignments, analysis tools to interpret homozygous or heterozygous sequencing results (12) and tools for DNA probe alignments.
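
The ClinVar properties query quoted in this section can also be issued through the E-utilities. The sketch below is a minimal, unofficial illustration: it only builds an ESearch URL for the ClinVar database and does not contact NCBI. The helper name `build_esearch_url`, the `retmax` default and the choice of JSON output are assumptions made for this example and are not taken from the article.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Public E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The properties query quoted in the text, in decoded form.
ACMG_TERM = '"gene acmg incidental 2013"[Properties]'


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Return an ESearch URL for `term` in Entrez database `db`.

    Hypothetical helper: retmax=20 and retmode=json are illustrative choices.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes the quotes and square brackets in the term.
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_esearch_url("clinvar", ACMG_TERM)
print(url)

# Decoding the query string recovers the original search term.
print(parse_qs(urlsplit(url).query)["term"][0])
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) should return identifiers of the matching ClinVar records; anyone scripting such requests should follow NCBI's usage guidelines for the E-utilities.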

MedGen
MedGen organizes information about human disorders that have a genetic component (www.ncbi.nlm.nih.gov/books/NBK159970/). Starting from freely available content in the semi-annual releases from UMLS (www.nlm.nih.gov/research/umls/), MedGen adds recent content from OMIM, terms and relationships from the Human Phenotype (www.human-phenotype-ontology.org/) and ORDO (www.orphadata.org/cgi-bin/inc/ordo_orphanet.inc.php) ontologies, and terms submitted from ClinVar and GTR. MedGen

GENOMES

BioProject
The BioProject database is a central access point for metadata about research projects whose data are deposited in an INSDC database. BioProject provides links to the primary data from these projects, which range from focused genome sequencing projects to large international collaborations. These larger projects may have multiple sub-projects incorporating experiments that produce nucleotide sequence sets,
genotype/phenotype data, sequence variants or epigenetic information.

Assembly
The Assembly database (3) collects metadata about genome assemblies that were either submitted to GenBank (or an INSDC database) or that are part of the RefSeq database. Assembly records also provide statistics about the genome as well as links to the sequence data in Entrez or in the genomes area of the FTP site.

Genome
The Genome database collects genomic sequencing projects for a given species and provides links to corresponding records in BioProject, Assembly, Nucleotide and Protein. Genome records collect genome assembly data at various levels of completion, ranging from genomes represented by scaffolds or contigs to fully assembled chromosomes with annotation. NCBI creates a Genome record for an organism if at least one assembly is available for that organism in the Assembly database. The Genome home page also provides links to an organism browser that lists the current status of all genomes annotated at NCBI.

RefSeq
The RefSeq database (13) is a non-redundant set of curated and computationally derived sequences for transcripts, proteins and genomic regions. RefSeq DNA and RNA sequences can be searched and retrieved from the Nucleotide database and the complete RefSeq collection is available in the RefSeq directory on the NCBI FTP site.

GenBank
GenBank (1) is the primary nucleotide sequence archive at NCBI and is a member of the International Nucleotide Sequence Database Collaboration (INSDC). Sequences from GenBank are available from three Entrez databases: Nucleotide, EST and GSS (specified as nuccore, nucest and nucgss within the E-utilities). The Nucleotide database contains all GenBank sequences except those within the EST or GSS GenBank divisions. The database also contains WGS sequences, Third Party Annotation (TPA) sequences and sequences imported from the Structure database.

PopSet
The PopSet database is a collection of related sequences and alignments derived from population, phylogenetic, mutation and ecosystem studies that have been submitted to GenBank. When available, PopSet alignments are shown in an embedded viewer on the PopSet record page.

Sequence Read Archive (SRA)
SRA (14) is a repository for raw sequence reads and alignments generated by high-throughput nucleic acid sequencers. Data are deposited into SRA as supporting evidence for a wide range of study types, including de novo genome assemblies, genome wide association studies, single nucleotide polymorphism and structural variation analysis, pathogen identification, transcript assembly, metagenomic community profiling and epigenetics studies.

Trace archive
The Trace Archive contains sequence traces from gel and capillary electrophoresis sequencers. These data arise from whole genomes of pathogens, organismal shotgun and BAC clone projects, and EST libraries. A companion resource, the Trace Assembly Archive, contains placements of individual trace reads on a GenBank sequence.

Clone database (CloneDB)
CloneDB is a resource for finding descriptions, sources, map positions and distributor information about available clones and libraries (15). For both genomic and cell-based clones and libraries, CloneDB contains information about the sequences themselves, such as their genomic mapping positions and associated markers, along with details about how the libraries were constructed.

Probe
The Probe database is a registry of nucleic acid reagents designed for use in a wide variety of biomedical research applications including genotyping, SNP discovery, gene expression, gene silencing and gene mapping. Probe also includes information on reagent distributors, probe effectiveness and computed sequence similarities.

BioSample
The BioSample database provides annotation for biological samples used in a variety of studies submitted to NCBI, including genomic sequencing, microarrays, genome wide association studies (GWAS) and epigenomics (16). The database promotes the use of structured and consistent attribute names and values that describe what the samples are as well as information about their provenance, where appropriate.

Taxonomy
The NCBI taxonomy database is a central organizing principle for the Entrez biological databases and provides links to all data for each taxonomic node, from superkingdoms to subspecies (17). The taxonomy database reflects sequence data from virtually all of the formally described species of prokaryotes and about 10% of the eukaryotes. The Taxonomy Browser can be used to view the taxonomy tree or retrieve data from any of the Entrez databases for a particular organism or group. In 2013 the Taxonomy database began including type material for prokaryotic type strains and eukaryotic type specimens (18). As of January 2014 NCBI no longer assigns taxonomy IDs to bacterial strains that do not already have taxonomy IDs (19). Instead, such sequences will be assigned the taxonomy ID of the bacterial species, while the strain will be included in the source information
of the sequence record. In addition, the sequence record will be linked to a BioSample record that will contain strain information such as relevant culture collections and details about how the strain was isolated.

Genome reference consortium (GRC)
The Genome Reference Consortium (GRC) (www.genomereference.org) is an international collaboration between the Wellcome Trust Sanger Institute, the Genome Institute at Washington University, EMBL and NCBI. The GRC aims to produce assemblies of higher eukaryotic genomes that best reflect complex allelic diversity that is consistent with currently available data. The GRC currently produces assemblies for human (GRCh38), mouse (GRCm38) and zebrafish (GRCz10). Between major assembly releases the GRC provides minor patch releases that provide additional sequence scaffolds that either correct errors in the assembly (fix patches) or add alternate loci (novel patches). GRC staff then incorporate these changes into the next major assembly release. GRC data are available for download from the NCBI FTP site (ftp.ncbi.nlm.nih.gov/pub/grc/) and the new genomes FTP area (see above).

dbSNP
The Database of Single Nucleotide Polymorphisms (dbSNP) (20) is a repository of all types of short genetic variations <50 bp in length, and so is a complement to dbVar (see below). dbSNP accepts submissions of both common and polymorphic variations, and contains both germline and somatic variations. In addition to archiving molecular details for each submission and calculating submitted variant locations on each genome assembly, dbSNP maintains information about population-specific allele frequencies and genotypes, reports the validation state of each variant and indicates if a variation call may be suspect because of paralogy (21).

dbVar
The Database of Genomic Structural Variation (dbVar) is an archive of large-scale genomic variants (generally >50 bp) such as insertions, deletions, translocations and inversions (22). These data are derived from several methods including computational sequence analysis and microarray experiments.

Variation viewer
The Variation Viewer (www.ncbi.nlm.nih.gov/variation/view) displays human variations from dbSNP, dbVar and ClinVar in the context of the current and previous human reference genome assemblies, now GRCh38 and GRCh37 (23). Variation Viewer provides users with genome-wide access to variations, both graphically and in tabular format. Users can search for variants by gene symbol, variant ID, or chromosomal coordinates, and can also upload their own data in several popular formats.

Virus variation resource
The Virus Variation Resource is an outgrowth of the Influenza virus and Dengue virus resources and has been updated to include West Nile virus, Middle East respiratory syndrome (MERS) coronavirus and Ebolavirus (23–26). The resource employs computational pipelines and manual curation to create consistent sequence annotations and metadata vocabularies across all sequences from constituent viruses. These standardized data are leveraged by a specialized search interface and a suite of tools designed to support the retrieval and display of large virus sequence datasets.

Epigenomics
The Epigenomics database collects data from studies examining epigenetic features such as post-translational modifications of histone proteins, genomic DNA methylation, chromatin organization and the expression of non-coding regulatory RNA (27). The Epigenomics database provides displays (genome tracks) of the raw data (stored in the Gene Expression Omnibus (GEO) and SRA databases) mapped to genomic coordinates. Data from the Roadmap Epigenomics project, currently stored in GEO (www.ncbi.nlm.nih.gov/geo/roadmap/epigenomics/), are being mirrored and are available for viewing and downloading.

GENES

Gene
Gene (28) provides an interface to curated sequences and descriptive information about genes with links to a wide variety of gene-related resources. These data are accumulated and maintained through several international collaborations in addition to curation by NCBI staff. The complete Gene dataset, as well as organism-specific subsets, is available in the compact NCBI Abstract Syntax Notation One (ASN.1) format on the NCBI FTP site. The gene2xml tool converts the native Gene ASN.1 format into XML and is available at ftp.ncbi.nlm.nih.gov/toolbox/ncbi_tools/converters/by_program/gene2xml/.

RefSeqGene
As part of the Locus Reference Genomic (LRG) collaboration (www.lrg-sequence.org), RefSeqGene provides stable, standard human genomic sequences annotated with mRNAs for well-characterized human genes (13). RefSeqGene records are part of the RefSeq collection and are used to establish numbering systems for exons and introns and for reporting and identifying genomic variants, especially those of clinical importance (29). RefSeqGene records can be retrieved from the Nucleotide database using the query ‘refseqgene[keyword]’, are available on corresponding Gene reports and can be downloaded from ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/RefSeqGene.

The conserved CDS database (CCDS)
The conserved CDS database (CCDS) project is a collaborative effort between NCBI, the European Bioinformatics Institute, the Wellcome Trust Sanger Institute (WTSI)
and the University of California, Santa Cruz (UCSC). The CCDS compiles a set of human and mouse protein coding regions that are consistently annotated and of high quality (30). The collaborators prepare the CCDS set by comparing the annotations they have independently determined and then identifying those coding regions that have identical coordinates on the genome. Those regions that pass quality evaluations are then added to the CCDS set. The CCDS sequence data are available at ftp.ncbi.nlm.nih.gov/pub/CCDS/.

Gene expression omnibus (GEO)
GEO (31) is a data repository and retrieval system for high-throughput functional genomic data generated by microarray and next-generation sequencing technologies. In addition to gene expression data, GEO accepts data from studies of genome copy number variation, genome-protein interaction surveys and methylation profiling studies. The repository can capture fully annotated raw and processed data, enabling compliance with reporting standards such as ‘Minimum Information About a Microarray Experiment’ (MIAME) (23,24). GEO data are housed in two Entrez databases: GEO Profiles, which contains quantitative gene expression measurements for one gene across an experiment, and GEO DataSets, which contains entire experiments.

UniGene
UniGene (32) is a system for partitioning transcript sequences (including ESTs) from GenBank into a non-redundant set of clusters, each of which contains sequences that seem to be produced by the same transcription locus. UniGene clusters are created for all organisms for which there are 70 000 or more ESTs in GenBank.

HomoloGene
HomoloGene is a system that automatically detects homologs, including paralogs and orthologs, among the genes of 21 completely sequenced eukaryotic genomes. HomoloGene reports include homology and phenotype information drawn from Online Mendelian Inheritance in Man (OMIM) (33), Mouse Genome Informatics (MGI) (34), the Zebrafish Information Network (ZFIN) (35), the Saccharomyces Genome Database (SGD) (36) and FlyBase (37). Information about the HomoloGene build procedure is provided at www.ncbi.nlm.nih.gov/HomoloGene/HTML/homologene_buildproc.html.

PROTEINS

RefSeq
In addition to genomic and transcript sequences, the RefSeq database (13) contains protein sequences that are curated and computationally derived from these DNA and RNA sequences. RefSeq protein sequences can be searched and retrieved from the Protein database, and the complete RefSeq collection is available in the RefSeq directory on the NCBI FTP site.

GenBank and other sources
As part of standard submission procedures, NCBI produces conceptual translations for any sequence in GenBank that contains a coding sequence and places these protein sequences in the Protein database. In addition to these GenPept sequences, the Protein database also contains sequences from TPA, UniProtKB/Swiss-Prot (38), the Protein Research Foundation (PRF) and the Protein Data Bank (PDB) (39).

Molecular modeling database (MMDB)
The Molecular Modeling Database (MMDB) (40) contains experimentally determined coordinate sets from PDB (39) augmented with domain annotations and links to relevant literature, protein and nucleotide sequences, chemicals (PDB heterogens), and conserved domains in the Conserved Domain Database (CDD) (41). MMDB also provides interactive views of the data in Cn3D (42), the NCBI structure and alignment viewer. MMDB provides structural neighbors for each record based on similarities computed by the VAST algorithm between compact structural domains within protein structures (43,44).

Conserved domain database (CDD)
CDD (45) contains PSI-BLAST-derived Position Specific Score Matrices representing domains taken from the Simple Modular Architecture Research Tool (SMART) (46), Pfam (47), TIGRFAM (48) and from domain alignments derived from the Clusters of Orthologous Groups (COGs) database and the Protein Clusters database. In addition, CDD includes superfamily records that contain sets of CDs from one or more source databases that generate overlapping annotation on the same protein sequences.

Protein clusters
The Protein Clusters database contains sets of almost identical RefSeq proteins encoded by complete genomes from prokaryotes, eukaryotic organelles (mitochondria and chloroplasts), viruses and plasmids, as well as from some protozoans and plants. The clusters are organized in a taxonomic hierarchy and are created based on reciprocal best-hit protein BLAST scores (49).

HIV-1/human protein interaction database
The HIV-1/Human Protein Interaction Database is an online presentation of documented interactions between HIV-1 proteins, host cell proteins, other HIV-1 proteins or proteins from disease organisms associated with HIV or AIDS (50). These data are maintained by the Division of Acquired Immunodeficiency Syndrome of the National Institute of Allergy and Infectious Diseases (NIAID) in collaboration with the Southern Research Institute and NCBI.

BLAST SEQUENCE ANALYSIS

BLAST software
The BLAST programs (51–53) perform sequence-similarity searches against a variety of nucleotide and protein
databases, returning a set of gapped alignments with links to full sequence records and related NCBI resources. The basic BLAST programs are also available as standalone command line programs and network clients at ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ (Table 2).

BLAST on the cloud
NCBI provides an experimental Amazon Machine Image (AMI) for BLAST hosted at the Amazon Web Services (AWS) Marketplace. This AMI is preconfigured with the latest BLAST+ applications and includes a FUSE client that can download BLAST databases from NCBI as needed. Users can also upload their own custom databases. For tools that access NCBI BLAST programmatically, the AMI also supports the BLAST URL API, making an AWS instance a drop-in replacement for the NCBI BLAST website. More information is available at the BLAST help page as well as in an archived webinar about this AMI (www.ncbi.nlm.nih.gov/education/webinars/).

BLAST databases
The default database for nucleotide BLAST searches (nr/nt) contains all RefSeq RNA records plus all GenBank sequences except for those from the EST, GSS, STS and HTG divisions. Another featured database is Human genomic plus transcript, which contains human RefSeq transcript and genomic sequences arising from the NCBI annotation of the human genome. A similar database is available for mouse. Additional databases are also available and are described in links from the BLAST search form. Each of these databases can be limited to an arbitrary taxonomic node or those records satisfying any Entrez query.

For proteins the default database (nr) is a non-redundant set of all CDS translations from GenBank along with all sequences from RefSeq, UniProtKB/Swiss-Prot, PDB and the Protein Research Foundation (PRF). Subsets of this database are also available, such as the PDB or UniProtKB/Swiss-Prot sequences, along with separate databases for sequences from patents and environmental samples. Like the nucleotide databases, these collections can be limited by taxonomy or an arbitrary Entrez query.

BLAST output formats
Standard BLAST output formats include the default pairwise alignment, several query-anchored multiple sequence alignment formats, an easily parsable Hit Table and a report that organizes the BLAST hits by taxonomy. A pairwise with identities mode better highlights differences between the query and a target sequence. A Tree View option for the Web BLAST service creates a dendrogram that clusters sequences according to their distances from the query sequence. Each alignment returned by BLAST is scored and assigned a measure of statistical significance called the Expectation Value (E-value). The alignments returned can be limited by an E-value threshold or range.

Genomic BLAST
NCBI maintains Genomic BLAST services that mirror the design of the standard BLAST forms and allow users access to specialized databases for each particular genome. The default database contains the genomic sequence of an organism, but additional databases are provided depending on the available data and annotations. The default algorithm for Genomic BLAST is MegaBLAST (54), a faster version of standard nucleotide BLAST designed to find alignments between nearly identical sequences, typically from the same species. For rapid cross-species nucleotide queries, NCBI offers Discontiguous MegaBLAST, which uses a non-contiguous word match (55) as the nucleus for its alignments. Discontiguous MegaBLAST is far more rapid than a translated search such as blastx, yet maintains a competitive degree of sensitivity when comparing coding regions.

Primer-BLAST
Primer-BLAST is a tool for designing and analyzing polymerase chain reaction (PCR) primers based on the existing program Primer3 (56) that designs PCR primers given a template DNA sequence. Primer-BLAST extends this functionality by running a BLAST search against a chosen database with the designed primers as queries, and then returns only those primer pairs specific to the desired target. If a user provides only one primer with the DNA template, the other primer will be designed and analyzed. If a user provides both primers and a template, the tool performs only the final BLAST analysis. If a user provides both primers but no template, Primer-BLAST will display those templates that best match the primer pair. The available databases include the RefSeq mRNA collection, the BLAST nr database and genomic sets for one of twelve model organisms.

IgBLAST
IgBLAST is a specialized BLAST tool that facilitates the analysis of immunoglobulin variable domain sequences and T-cell receptor sequences (57). In addition to a standard BLAST analysis, IgBLAST reports the germline V, D and J gene matches to the query sequence, annotates immunoglobulin domains, reveals V(D)J junction details and indicates whether the rearrangement is in-frame or out-of-frame. IgBLAST is available both as a web tool and as a stand-alone package.

COBALT
The Constraint-based multiple protein Alignment Tool (COBALT) (58) is a multiple alignment algorithm for proteins that finds a collection of pairwise constraints derived from both the NCBI CDD and the sequence similarity programs RPS-BLAST, BLASTp and PHI-BLAST. These pairwise constraints are then incorporated into a progressive multiple alignment. Links at the top of the COBALT report provide access to a phylogenetic tree view of the multiple alignment and allow users either to launch a modified search or download the alignment in several popular formats.
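
As an illustration of limiting BLAST results by an E-value threshold or range, the sketch below filters a saved Hit Table after the search has run. It assumes the table is tab-separated with the twelve standard columns of BLAST+ tabular output (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score) and that comment lines begin with `#`. The names `Hit`, `parse_hit_table` and `filter_by_evalue`, and the sample rows, are invented for this example.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Hit(NamedTuple):
    """The subset of Hit Table columns used in this sketch."""
    query: str
    subject: str
    percent_identity: float
    evalue: float
    bit_score: float


def parse_hit_table(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per data row, skipping blank and '#' comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column row (assumed layout)
        yield Hit(fields[0], fields[1], float(fields[2]),
                  float(fields[10]), float(fields[11]))


def filter_by_evalue(hits: Iterable[Hit], max_evalue: float,
                     min_evalue: float = 0.0) -> List[Hit]:
    """Keep hits whose E-value lies in [min_evalue, max_evalue]."""
    return [h for h in hits if min_evalue <= h.evalue <= max_evalue]


# Hypothetical rows; identifiers and scores are made up for illustration.
SAMPLE_ROWS = [
    "# example hit table",
    "\t".join(["query1", "subjA", "99.5", "200", "1", "0",
               "1", "200", "5", "204", "1e-100", "365"]),
    "\t".join(["query1", "subjB", "85.0", "120", "18", "0",
               "10", "129", "40", "159", "2e-8", "58.4"]),
    "\t".join(["query1", "subjC", "70.0", "40", "12", "0",
               "50", "89", "7", "46", "0.5", "30.1"]),
]

hits = list(parse_hit_table(SAMPLE_ROWS))
for hit in filter_by_evalue(hits, max_evalue=1e-5):
    print(hit.subject, hit.evalue)
```

Filtering after the search keeps the full result set on disk, so the threshold can be tightened or relaxed without rerunning BLAST; the same limit can instead be applied at search time through the E-value setting of the search itself.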

Table 2. Selected NCBI software available for download

Software               Available binaries         Category within this article
BLAST (standalone)     Win, Mac, LINUX            BLAST sequence analysis
IgBLAST (standalone)   Win, Mac, LINUX            BLAST sequence analysis
CD-Tree                Win, Mac                   Proteins
Cn3D                   Win, Mac                   Proteins
PC3D                   Win, Mac, LINUX            Chemicals
gene2xml               Win, Mac, LINUX, Solaris   Genes
Genome Workbench       Win, Mac, LINUX            Genomes
splign                 LINUX, Solaris             Genomes
tbl2asn                Win, Mac, LINUX, Solaris   Genomes

CHEMICALS

PubChem
PubChem (59,60) focuses on the chemical, structural and biological properties of small molecules, in particular their roles as diagnostic and therapeutic agents. A suite of three Entrez databases, PCSubstance, PCCompound and PCBioAssay, contains the structural and bioactivity data of the PubChem project. PubChem also provides a diverse set of three-dimensional (3D) conformers for 90% of the records in the PubChem Compound database.

Biosystems
The Biosystems database collects together molecules represented in Gene, Protein and PubChem that interact in a biological system, such as a biochemical pathway or disease. Currently Biosystems receives data from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (61–63), BioCyc (64), Reactome (65), the Pathway Interaction Database (66), WikiPathways (67,68) and Gene Ontology (69).

FOR FURTHER INFORMATION
The resources described here include documentation, other explanatory material and references to collaborators and data sources on their respective websites. An alphabetical list of NCBI resources is available from a link above the category list on the left side of the NCBI home page. The NCBI Help Manual and the new second edition of the NCBI Handbook (www.ncbi.nlm.nih.gov/books/NBK143764/), both available as links in the common page footer, describe the principal NCBI resources in detail. The NCBI Learn page (www.ncbi.nlm.nih.gov/home/learn.shtml) provides links to documentation, tutorials, webinars, courses and upcoming conference exhibits. A variety of video tutorials are available on the NCBI YouTube channel that can be accessed through links in the standard NCBI page footer. A user-support staff is available to answer questions at info@ncbi.nlm.nih.gov. Updates on NCBI resources and database enhancements are described on the NCBI News site (www.ncbi.nlm.nih.gov/news/), NCBI social media sites (Facebook, Twitter, Google+ and LinkedIn), the ‘NCBI Insights’ blog, and the several mailing lists and RSS feeds that provide updates on services and databases. Links to these resources are in the NCBI page footer and on the NCBI News site.

FUNDING
Funding for open access charge: Intramural Research Program of the National Institutes of Health, National Library of Medicine.
Conflict of interest statement. None declared.

REFERENCES
1. Benson,D.A., Cavanaugh,M., Clark,K., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J. and Sayers,E.W. (2016) GenBank. Nucleic Acids Res., doi:10.1093/nar/gkv1276.
2. Rubinstein,W.S., Maglott,D.R., Lee,J.M., Kattman,B.L., Malheiro,A.J., Ovetsky,M., Hem,V., Gorelenkov,V., Song,G., Wallin,C. et al. (2013) The NIH genetic testing registry: a new, centralized database of genetic tests to enable access to comprehensive information and improve transparency. Nucleic Acids Res., 41, D925–D935.
3. Kitts,P.A., Church,D.M., Choi,J., Hem,V., Smith,R., Tatusova,T., Thibaud-Nissen,F., DiCuccio,M., Murphy,T.D., Pruitt,K.D. et al. (2016) Assembly: a resource for assembled genomes at NCBI. Nucleic Acids Res., doi:10.1093/nar/gkv1226.
4. NCBI Resource Coordinators. (2015) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 43, D6–D17.
5. Schuler,G.D., Epstein,J.A., Ohkawa,H. and Kans,J.A. (1996) Entrez: molecular biology database and retrieval system. Methods Enzymol., 266, 141–162.
6. Sequeira,E. (2003) PubMed Central––three years old and growing stronger. ARL, 228, 5–9.
7. Sewell,W. (1964) Medical Subject Headings in Medlars. Bull. Med. Libr. Assoc., 52, 164–170.
8. Landrum,M.J., Lee,J.M., Riley,G.R., Jang,W., Rubinstein,W.S., Church,D.M. and Maglott,D.R. (2014) ClinVar: public archive of relationships among sequence variation and human phenotype. Nucleic Acids Res., 42, D980–D985.
9. Green,R.C., Berg,J.S., Grody,W.W., Kalia,S.S., Korf,B.R., Martin,C.L., McGuire,A.L., Nussbaum,R.L., O’Daniel,J.M., Ormond,K.E. et al. (2013) ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet. Med., 15, 565–574.
10. Manolio,T.A., Rodriguez,L.L., Brooks,L., Abecasis,G., Ballinger,D., Daly,M., Donnelly,P., Faraone,S.V., Frazer,K., Gabriel,S. et al. (2007) New models of collaboration in genome-wide association studies: the Genetic Association Information Network. Nat. Genet., 39, 1045–1051.
11. Blumenfeld,O.O. and Patnaik,S.K. (2004) Allelic genes of blood group antigens: a source of human mutations and cSNPs documented in the Blood Group Antigen Gene Mutation Database. Hum. Mutat., 23, 8–16.
12. Helmberg,W., Dunivin,R. and Feolo,M. (2004) The sequencing-based typing tool of dbMHC: typing highly polymorphic gene sequences. Nucleic Acids Res., 32, W173–W175.
13. Pruitt,K.D., Tatusova,T., Brown,G.R. and Maglott,D.R. (2012) NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy. Nucleic Acids Res., 40, D130–D135.
14. Kodama,Y., Shumway,M. and Leinonen,R. (2012) The Sequence Read Archive: explosive growth of sequencing data. Nucleic Acids Res., 40, D54–D56.
15. Schneider,V.A., Chen,H.C., Clausen,C., Meric,P.A., Zhou,Z., Bouk,N., Husain,N., Maglott,D.R. and Church,D.M. (2013) Clone DB: an integrated NCBI resource for clone-associated data. Nucleic Acids Res., 41, D1070–D1078.
16. Barrett,T., Clark,K., Gevorgyan,R., Gorelenkov,V., Gribov,E., Karsch-Mizrachi,I., Kimelman,M., Pruitt,K.D., Resenchuk,S., Tatusova,T. et al. (2012) BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata. Nucleic Acids Res., 40, D57–D63.
17. Federhen,S. (2012) The NCBI Taxonomy database. Nucleic Acids Res., 40, D136–D143.
18. Federhen,S. (2015) Type material in the NCBI Taxonomy Database. Nucleic Acids Res., 43, D1086–D1098.
19. Federhen,S., Clark,K., Barrett,T., Parkinson,H., Ostell,J., Kodama,Y., Mashima,J., Nakamura,Y., Cochrane,G. and Karsch-Mizrachi,I. (2014) Toward richer metadata for microbial sequences: replacing strain-level NCBI taxonomy taxids with BioProject, BioSample and Assembly records. Stand. Genomic Sci., 9, 1275–1277.
20. Sherry,S.T., Ward,M.H., Kholodov,M., Baker,J., Phan,L., Smigielski,E.M. and Sirotkin,K. (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res., 29, 308–311.
21. Sudmant,P.H., Kitzman,J.O., Antonacci,F., Alkan,C., Malig,M., Tsalenko,A., Sampas,N., Bruhn,L., Shendure,J. and Eichler,E.E. (2010) Diversity of human copy number variation and multicopy genes. Science, 330, 641–646.
22. Church,D.M., Lappalainen,I., Sneddon,T.P., Hinton,J., Maguire,M., Lopez,J., Garner,J., Paschall,J., Dicuccio,M., Yaschenko,E. et al. (2010) Public data archives for genomic structural variation. Nat. Genet., 42, 813–814.
23. Brister,J.R., Bao,Y., Zhdanov,S.A., Ostapchuck,Y., Chetvernin,V., Kiryutin,B., Zaslavsky,L., Kimelman,M. and Tatusova,T.A. (2014) Virus Variation Resource–recent updates and future directions. Nucleic Acids Res., 42, D660–D665.
24. Brister,J.R., Ako-Adjei,D., Bao,Y. and Blinkova,O. (2015) NCBI viral genomes resource. Nucleic Acids Res., 43, D571–D577.
25. Resch,W., Zaslavsky,L., Kiryutin,B., Rozanov,M., Bao,Y. and Tatusova,T.A. (2009) Virus variation resources at the National Center for Biotechnology Information: dengue virus. BMC Microbiol., 9, 65–71.
26. Zaslavsky,L., Bao,Y. and Tatusova,T.A. (2008) Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation. BMC Bioinformatics, 9, 237–243.
27. Fingerman,I.M., McDaniel,L., Zhang,X., Ratzat,W., Hassan,T., Jiang,Z., Cohen,R.F. and Schuler,G.D. (2011) NCBI Epigenomics: a new public resource for exploring epigenomic data sets. Nucleic Acids Res., 39, D908–D912.
34. Eppig,J.T., Blake,J.A., Bult,C.J., Kadin,J.A. and Richardson,J.E. (2007) The mouse genome database (MGD): new features facilitating a model system. Nucleic Acids Res., 35, D630–D637.
35. Sprague,J., Bayraktaroglu,L., Clements,D., Conlin,T., Fashena,D., Frazer,K., Haendel,M., Howe,D.G., Mani,P., Ramachandran,S. et al. (2006) The Zebrafish Information Network: the zebrafish model organism database. Nucleic Acids Res., 34, D581–D585.
36. Hong,E.L., Balakrishnan,R., Dong,Q., Christie,K.R., Park,J., Binkley,G., Costanzo,M.C., Dwight,S.S., Engel,S.R., Fisk,D.G. et al. (2008) Gene Ontology annotations at SGD: new data sources and annotation methods. Nucleic Acids Res., 36, D577–D581.
37. Crosby,M.A., Goodman,J.L., Strelets,V.B., Zhang,P. and Gelbart,W.M. (2007) FlyBase: genomes by the dozen. Nucleic Acids Res., 35, D486–D491.
38. Magrane,M. and Consortium,U. (2011) UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford), 2011, bar009.
39. Berman,H., Henrick,K., Nakamura,H. and Markley,J.L. (2007) The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res., 35, D301–D303.
40. Madej,T., Addess,K.J., Fong,J.H., Geer,L.Y., Geer,R.C., Lanczycki,C.J., Liu,C., Lu,S., Marchler-Bauer,A., Panchenko,A.R. et al. (2012) MMDB: 3D structures and macromolecular interactions. Nucleic Acids Res., 40, D461–D464.
41. Marchler-Bauer,A., Derbyshire,M.K., Gonzales,N.R., Lu,S., Chitsaz,F., Geer,L.Y., Geer,R.C., He,J., Gwadz,M., Hurwitz,D.I. et al. (2015) CDD: NCBI’s conserved domain database. Nucleic Acids Res., 43, D222–D226.
42. Wang,Y., Geer,L.Y., Chappey,C., Kans,J.A. and Bryant,S.H. (2000) Cn3D: sequence and structure views for Entrez. Trends Biochem. Sci., 25, 300–302.
43. Gibrat,J.F., Madej,T. and Bryant,S.H. (1996) Surprising similarities in structure comparison. Curr. Opin. Struct. Biol., 6, 377–385.
44. Madej,T., Gibrat,J.F. and Bryant,S.H. (1995) Threading a database of protein cores. Proteins, 23, 356–369.
45. Marchler-Bauer,A., Anderson,J.B., Chitsaz,F., Derbyshire,M.K., DeWeese-Scott,C., Fong,J.H., Geer,L.Y., Geer,R.C., Gonzales,N.R., Gwadz,M. et al. (2009) CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res., 37, D205–D210.
46. Letunic,I., Copley,R.R., Pils,B., Pinkert,S., Schultz,J. and Bork,P. (2006) SMART 5: domains in the context of genomes and networks. Nucleic Acids Res., 34, D257–D260.
47. Finn,R.D., Mistry,J., Schuster-Bockler,B., Griffiths-Jones,S., Hollich,V., Lassmann,T., Moxon,S., Marshall,M., Khanna,A., Durbin,R. et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res., 34, D247–D251.
48. Haft,D.H., Selengut,J.D. and White,O. (2003) The TIGRFAMs database of protein families. Nucleic Acids Res., 31, 371–373.
49. Klimke,W., Agarwala,R., Badretdin,A., Chetvernin,S., Ciufo,S., Fedorov,B., Kiryutin,B., O’Neill,K., Resch,W., Resenchuk,S. et al. (2009) The National Center for Biotechnology Information’s Protein Clusters Database. Nucleic Acids Res., 37, D216–D223.
28. Brown,G.R., Hem,V., Katz,K.S., Ovetsky,M., Wallin,C., 50. Fu,W., Sanders-Beer,B.E., Katz,K.S., Maglott,D.R., Pruitt,K.D. and
Ermolaeva,O., Tolstoy,I., Tatusova,T., Pruitt,K.D., Maglott,D.R. Ptak,R.G. (2009) Human immunodeficiency virus type 1, human
et al. (2015) Gene: a gene-centered information resource at NCBI. protein interaction database at NCBI. Nucleic Acids Res., 37, D417–
Nucleic Acids Res., 43, D36–D42. D422.
29. Gulley,M.L., Braziel,R.M., Halling,K.C., Hsi,E.D., Kant,J.A., 51. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J.
Nikiforova,M.N., Nowak,J.A., Ogino,S., Oliveira,A., Polesky,H.F. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
et al. (2007) Clinical laboratory reports in molecular pathology. Arch. 52. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z.,
Pathol. Lab. Med., 131, 852–863. Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST:
30. Farrell,C.M., O’Leary,N.A., Harte,R.A., Loveland,J.E., a new generation of protein database search programs. Nucleic Acids
Wilming,L.G., Wallin,C., Diekhans,M., Barrell,D., Searle,S.M., Res., 25, 3389–3402.
Aken,B. et al. (2014) Current status and new features of the 53. Boratyn,G.M., Camacho,C., Cooper,P.S., Coulouris,G., Fong,A.,
Consensus Coding Sequence database. Nucleic Acids Res., 42, Ma,N., Madden,T.L., Matten,W.T., McGinnis,S.D., Merezhuk,Y.
D865–D872. et al. (2013) BLAST: a more efficient report with usability
31. Barrett,T., Wilhite,S.E., Ledoux,P., Evangelista,C., Kim,I.F., improvements. Nucleic Acids Res., 41, W29–W33.
Tomashevsky,M., Marshall,K.A., Phillippy,K.H., Sherman,P.M., 54. Zhang,Z., Schwartz,S., Wagner,L. and Miller,W. (2000) A greedy
Holko,M. et al. (2013) NCBI GEO: archive for functional genomics algorithm for aligning DNA sequences. J. Comput. Biol., 7, 203–214.
data sets–update. Nucleic Acids Res., 41, D991–D995. 55. Ma,B., Tromp,J. and Li,M. (2002) PatternHunter: faster and more
32. Schuler,G.D. (1997) Pieces of the puzzle: expressed sequence tags and sensitive homology search. Bioinformatics, 18, 440–445.
the catalog of human genes. J. Mol. Med., 75, 694–698. 56. Rozen,S. and Skalestsky,H.J. (2000) Primer3 on the WWW for
33. Amberger,J., Bocchini,C.A., Scott,A.F. and Hamosh,A. (2009) general users and for biologist programmers. In: Krawetz,S and
McKusick’s Online Mendelian Inheritance in Man (OMIM). Nucleic Misener,S (eds). Bioinformatics Methods and Protocols: Methods in
Acids Res., 37, D793–D796. Molecular Biology. Humana Press, Totowa, NJ, pp. 365–386.

57. Ye,J., Ma,N., Madden,T.L. and Ostell,J.M. (2013) IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res., 41, W34–W40.
58. Papadopoulos,J.S. and Agarwala,R. (2007) COBALT: constraint-based alignment tool for multiple protein sequences. Bioinformatics, 23, 1073–1079.
59. Wang,Y., Xiao,J., Suzek,T.O., Zhang,J., Wang,J. and Bryant,S.H. (2009) PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Res., 37, W623–W633.
60. Wang,Y., Xiao,J., Suzek,T.O., Zhang,J., Wang,J., Zhou,Z., Han,L., Karapetyan,K., Dracheva,S., Shoemaker,B.A. et al. (2012) PubChem’s BioAssay Database. Nucleic Acids Res., 40, D400–D412.
61. Kanehisa,M., Araki,M., Goto,S., Hattori,M., Hirakawa,M., Itoh,M., Katayama,T., Kawashima,S., Okuda,S., Tokimatsu,T. et al. (2008) KEGG for linking genomes to life and the environment. Nucleic Acids Res., 36, D480–D484.
62. Kanehisa,M. and Goto,S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res., 28, 27–30.
63. Kanehisa,M., Goto,S., Hattori,M., Aoki-Kinoshita,K.F., Itoh,M., Kawashima,S., Katayama,T., Araki,M. and Hirakawa,M. (2006) From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res., 34, D354–D357.
64. Keseler,I.M., Bonavides-Martinez,C., Collado-Vides,J., Gama-Castro,S., Gunsalus,R.P., Johnson,D.A., Krummenacker,M., Nolan,L.M., Paley,S., Paulsen,I.T. et al. (2009) EcoCyc: a comprehensive view of Escherichia coli biology. Nucleic Acids Res., 37, D464–D470.
65. Matthews,L., Gopinath,G., Gillespie,M., Caudy,M., Croft,D., de Bono,B., Garapati,P., Hemish,J., Hermjakob,H., Jassal,B. et al. (2009) Reactome knowledgebase of human biological pathways and processes. Nucleic Acids Res., 37, D619–D622.
66. Schaefer,C.F., Anthony,K., Krupa,S., Buchoff,J., Day,M., Hannay,T. and Buetow,K.H. (2009) PID: the Pathway Interaction Database. Nucleic Acids Res., 37, D674–D679.
67. Kelder,T., Pico,A.R., Hanspers,K., van Iersel,M.P., Evelo,C. and Conklin,B.R. (2009) Mining biological pathways using WikiPathways web services. PLoS One, 4, e6447.
68. Pico,A.R., Kelder,T., van Iersel,M.P., Hanspers,K., Conklin,B.R. and Evelo,C. (2008) WikiPathways: pathway editing for the people. PLoS Biol., 6, e184.
69. Ashburner,M., Ball,C.A., Blake,J.A., Botstein,D., Butler,H., Cherry,J.M., Davis,A.P., Dolinski,K., Dwight,S.S., Eppig,J.T. et al. (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet., 25, 25–29.

APPENDIX

NCBI Resource Coordinators: Richa Agarwala, Tanya Barrett, Jeff Beck, Dennis A Benson, Colleen Bollin, Evan Bolton, Devon Bourexis, J Rodney Brister, Stephen H Bryant, Kathi Canese, Chad Charowhas, Karen Clark, Michael DiCuccio, Ilya Dondoshansky, Scott Federhen, Michael Feolo, Kathryn Funk, Lewis Y Geer, Viatcheslav Gorelenkov, Marilu Hoeppner, Brad Holmes, Mark Johnson, Viatcheslav Khotomlianski, Avi Kimchi, Michael Kimelman, Paul Kitts, William Klimke, Sergey Krasnov, Anatoliy Kuznetsov, Melissa J Landrum, David Landsman, Jennifer M Lee, David J Lipman, Zhiyong Lu, Thomas L Madden, Tom Madej, Aron Marchler-Bauer, Ilene Karsch-Mizrachi, Terence Murphy, Rebecca Orris, James Ostell, Christopher O’Sullivan, Anna Panchenko, Lon Phan, Don Preuss, Kim D Pruitt, Kurt Rodarmer, Wendy Rubinstein, Eric W Sayers, Valerie Schneider, Gregory D Schuler, Stephen T Sherry, Karl Sirotkin, Karanjit Siyan, Douglas Slotta, Alexandra Soboleva, Vladimir Soussov, Grigory Starchenko, Tatiana A Tatusova, Kamen Todorov, Bart W Trawick, Denis Vakatov, Yanli Wang, Minghong Ward, W John Wilbur, Eugene Yaschenko, Kerry Zbicz.

