LO4 Access To Sequenced Data and Related Information
Prepared by:
Joseph Martin Q. Paet
Biology Department, College of Science
Bicol University
High-throughput technologies like PCR, sequencing, and molecular assays
Centralized Databases Store DNA Sequences
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Integration of Biological Databases
Tier 1 Tier 2 Tier 3
Challenges
Database architecture = similar structure
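Access: how to access the data and what can be accessed (data surfing)
Naming system: equivalent genes can be named differently across species (S. cerevisiae RAD24 = rad17 in S. pombe)
Clash of concepts: databases may define the same term differently (e.g., the definition of a gene)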
Stein, L. D., et al. (2003). Integrating Biological Databases. Nature Reviews Genetics, 4, 337-345. doi: https://doi.org/10.1038/nrg1065
Link Integration
▪ Researchers begin their query with one data source and then follow hypertext links to related information in other data sources
▪ Vulnerable to naming clashes, ambiguities, and database updates; results depend on the researcher following the links
View Integration
▪ Leaves the information in its source databases but builds an environment around the databases that makes them all seem to be part of one large system
▪ Did not perform as well as the source databases
Data Warehousing
▪ Brings all the data under one roof in a single database
▪ Issue: keeping the data warehouse up to date
Stein, L. D., et al. (2003). Integrating Biological Databases. Nature Reviews Genetics, 4, 337-345. doi: https://doi.org/10.1038/nrg1065
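Link integration and data warehousing can be contrasted with a toy sketch. Everything in it is hypothetical: the two in-memory "databases", their records, and the synonym table are invented for illustration. Only the RAD24/rad17 pairing comes from the Challenges slide.

```python
# Toy sketch: link integration vs. data warehousing.
# All records below are made up for illustration only.

# Two independent "source databases", each keyed by its own gene names.
budding_yeast_db = {"RAD24": {"species": "S. cerevisiae", "note": "toy record A"}}
fission_yeast_db = {"rad17": {"species": "S. pombe", "note": "toy record B"}}

# Naming challenge from the slide: RAD24 (S. cerevisiae) = rad17 (S. pombe).
SYNONYMS = {"RAD24": "rad17"}


def follow_link(gene_name, target_db, synonyms=None):
    """Link integration: start in one source and follow a link by name."""
    if synonyms:
        gene_name = synonyms.get(gene_name, gene_name)
    return target_db.get(gene_name)  # None when the names do not match


def build_warehouse(sources, synonyms):
    """Data warehousing: copy every record into one store under a shared key."""
    warehouse = {}
    for source in sources:
        for name, record in source.items():
            shared_key = synonyms.get(name, name)
            warehouse.setdefault(shared_key, []).append(record)
    return warehouse


naive = follow_link("RAD24", fission_yeast_db)               # link breaks on the name clash
resolved = follow_link("RAD24", fission_yeast_db, SYNONYMS)  # link works with a synonym table
warehouse = build_warehouse([budding_yeast_db, fission_yeast_db], SYNONYMS)
```

The warehouse holds copies, so it goes stale whenever a source database changes; that is the upkeep issue listed under Data Warehousing.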
Contents of Databases
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Types of Biological Data
Genomic Databases
High-Throughput Genomic Sequence (HTGS): contains unfinished DNA sequences from sequencing centers
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
RNA Databases
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Protein Databases
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Central Bioinformatics Resource
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Access to Information
Entrez
A molecular biology database system that provides integrated access to databases
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
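Entrez can also be queried programmatically. The slide does not cover this, so the sketch below is an addition: it assumes NCBI's public E-utilities ESearch endpoint and only builds the request URL, without sending anything. The helper name `build_esearch_url` and the example search term are hypothetical; `nuccore` is the Nucleotide database name that appears in the NCBI URL cited in these slides.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (assumed here; not part of the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=5):
    """Return an ESearch URL that searches one Entrez database for a term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example: look for human beta globin records in the Nucleotide database.
url = build_esearch_url("nuccore", "beta globin[Title] AND Homo sapiens[Organism]")
print(url)
```

Opening the printed URL in a browser should return a list of matching record identifiers, the same kind of result the Entrez web search box produces.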
Searching Databases
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Accession Numbers
A string of about 4–12 numbers and/or alphabetic characters associated with a molecular sequence, expression, or structure record
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
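As an illustration of how an accession number encodes the kind of record, the sketch below inspects RefSeq-style accessions (two letters, an underscore, digits, and an optional version). RefSeq is not named on the slide, so treat it as an assumed example; the prefix table is deliberately incomplete, and the function name `describe_accession` is hypothetical.

```python
import re

# A few well-known RefSeq prefixes; this table is deliberately incomplete.
REFSEQ_PREFIXES = {
    "NC": "complete genomic molecule",
    "NG": "genomic region",
    "NM": "mRNA",
    "NR": "non-coding RNA",
    "NP": "protein",
    "XM": "predicted mRNA",
    "XP": "predicted protein",
}

# Two uppercase letters, underscore, digits, optional ".version" suffix.
REFSEQ_PATTERN = re.compile(r"^([A-Z]{2})_(\d+)(?:\.(\d+))?$")


def describe_accession(accession):
    """Describe a RefSeq-style accession, or report that it is not one."""
    match = REFSEQ_PATTERN.match(accession.strip())
    if match is None:
        return "not a RefSeq-style accession"
    prefix, _digits, version = match.groups()
    kind = REFSEQ_PREFIXES.get(prefix, "unrecognized RefSeq prefix")
    return kind if version is None else f"{kind}, version {version}"


print(describe_accession("NM_000518"))    # human beta globin mRNA
print(describe_accession("NP_000509.1"))  # human beta globin protein
```

Other databases use different accession formats, so a string that fails this pattern may still be a valid accession elsewhere.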
Format Display
General Description
View Options
Related Literature
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
National Center for Biotechnology Information (NCBI) (n.d.). Nucleotide. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/nuccore
Genome Browsers
Databases with a graphical interface representing sequence information and other data as a function of position across the chromosomes.
Principal genome browsers are:
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
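The idea of data "as a function of position across the chromosomes" can be sketched as a lookup by coordinate window. This is a minimal illustration, not how any particular genome browser is implemented; the `Feature` type, the track contents, and the coordinates are all invented.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    name: str


# Hypothetical annotation track; names and coordinates are invented.
TRACK = [
    Feature("chr1", 100, 500, "geneA"),
    Feature("chr1", 450, 900, "geneB"),
    Feature("chr2", 100, 300, "geneC"),
]


def features_in_view(track, chrom, view_start, view_end):
    """Return the features that overlap the window chrom:view_start-view_end."""
    return [
        f for f in track
        if f.chrom == chrom and f.start <= view_end and f.end >= view_start
    ]


# What a browser would draw when the view is set to chr1:480-600.
visible = features_in_view(TRACK, "chr1", 480, 600)
names = [f.name for f in visible]
```

Changing the window changes what is returned, which mirrors scrolling or zooming in a graphical browser.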
References:
National Center for Biotechnology Information (NCBI) (n.d.). Bookshelf. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/books/
National Center for Biotechnology Information (NCBI) (n.d.). Guide. Retrieved July 16, 2023 from https://ncbi.nlm.nih.gov/guide/all/
National Center for Biotechnology Information (NCBI) (n.d.). Nucleotide. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/nuccore
National Center for Biotechnology Information (NCBI) (n.d.). OMIM. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/omim
National Center for Biotechnology Information (NCBI) (n.d.). Structure. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/structure
National Center for Biotechnology Information (NCBI) (n.d.). Taxonomy. Retrieved July 16, 2023 from https://www.ncbi.nlm.nih.gov/taxonomy
Pevsner, J. (2015). Bioinformatics and Functional Genomics (3rd ed.). John Wiley & Sons Inc.
Stein, L. D., et al. (2003). Integrating Biological Databases. Nature Reviews Genetics, 4, 337-345. doi: https://doi.org/10.1038/nrg1065