1 Databases

Uploaded by

asmaarshad2626
WHAT is a database?

• A collection of data that needs to be:
– Structured
– Searchable
– Updated (periodically)
– Cross-referenced

• Challenge:
– To change “meaningless” data into useful information that can be
accessed and analysed the best way possible.

For example:
HOW would YOU organise all biological sequences so that the
biological information is optimally accessible?

You need an appropriate database management system (DBMS)


DBMS

• Internal organization
– Controls speed and flexibility

• A set of programs that
– Store
– Extract
– Modify

[Diagram: USER(S) store, extract and modify data in the Database through the DBMS]
DBMS organisation types

• Flat file databases (flat DBMS)
– Simple, restrictive, table
• Hierarchical databases (hierarchical DBMS)
– Simple, restrictive, tables
• Relational databases (RDBMS)
– Complex, versatile, tables
• Object-oriented databases (ODBMS)
– Complex, versatile, objects
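
As a rough illustration of the four organisation types, the sketch below holds the same two sequence records in each form. All accessions, gene names and variable names are made up for the example, and the hierarchical case is shown as a tree of parent and child records, which is an assumption about how that type is usually pictured.

```python
from dataclasses import dataclass

# Flat file: one line per record, tab-separated fields, organism repeated.
flat_file = "SEQ001\tgeneA\tHomo sapiens\nSEQ002\tgeneB\tHomo sapiens\n"
flat_rows = [line.split("\t") for line in flat_file.splitlines()]

# Hierarchical: every record hangs under exactly one parent (a tree).
hierarchical = {"Homo sapiens": {"geneA": ["SEQ001"], "geneB": ["SEQ002"]}}

# Relational: separate tables, linked through a key (here the integer 1).
organisms = {1: "Homo sapiens"}
sequences = [("SEQ001", "geneA", 1), ("SEQ002", "geneB", 1)]

# Object-oriented: data and the behaviour that uses it live together.
@dataclass
class SequenceRecord:
    accession: str
    gene: str
    organism: str

    def label(self) -> str:
        return f"{self.accession} ({self.gene}, {self.organism})"

record = SequenceRecord("SEQ001", "geneA", "Homo sapiens")
```

Note how the organism name is typed once in the relational form but once per record in the flat file.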
Relational Databases
• What have we achieved?
– No repeating information
– Less storage space
– Better reality representation
– Easy modification/management
– Easy usage of any combination of records

Remember:
the DBMS has programs to access and edit this information,
so it does not matter that primary keys are hard for
humans to read
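
A minimal sketch of these points, using Python's built-in sqlite3 module. The table names, column names and records are hypothetical; the point is only that the organism name is stored once and reached through a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organism (
        organism_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE
    );
    CREATE TABLE sequence (
        sequence_id INTEGER PRIMARY KEY,
        accession   TEXT NOT NULL UNIQUE,
        organism_id INTEGER NOT NULL REFERENCES organism(organism_id)
    );
""")
conn.execute("INSERT INTO organism (organism_id, name) VALUES (1, 'Homo sapiens')")
conn.executemany(
    "INSERT INTO sequence (accession, organism_id) VALUES (?, ?)",
    [("SEQ001", 1), ("SEQ002", 1)],
)

# Easy modification: one UPDATE changes the name for every linked sequence.
conn.execute("UPDATE organism SET name = 'Homo sapiens (human)' WHERE organism_id = 1")

# The DBMS follows the keys for us; the user never has to read them.
rows = conn.execute("""
    SELECT s.accession, o.name
    FROM sequence AS s
    JOIN organism AS o ON o.organism_id = s.organism_id
    ORDER BY s.accession
""").fetchall()
```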
Accessing database information
• A request for data from a database is
called a query

• Queries can be of three forms:


– Choose from a list of parameters
– Query by example (QBE)
– Query language
Query by Example (QBE) reports allow end users to query, insert, update, and delete
values in a database table or view.
In the QBE build wizard, you choose which data to display in the report. Alternatively,
you can allow end users to make their own queries in the QBE report's customization form.
Because the QBE system formulates the actual query, QBE is easier to learn than
formal query languages, such as the standard Structured Query Language (SQL).
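
The idea that the QBE system formulates the query for the user can be sketched as follows. This is not the API of any particular QBE product: `query_by_example` is a hypothetical helper, and the `protein` table and its rows are invented. The user fills in example values, and the helper writes the SQL.

```python
import sqlite3

def query_by_example(conn, table, example):
    """Run a SELECT built from a dict of column -> wanted value."""
    # Table and column names cannot be bound as parameters, so they are
    # checked against the database schema before being placed in the SQL.
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    if exists is None:
        raise ValueError(f"unknown table: {table}")
    columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    unknown = set(example) - columns
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    where = " AND ".join(f'"{column}" = ?' for column in example)
    sql = f'SELECT * FROM "{table}"' + (f" WHERE {where}" if where else "")
    # The example values themselves are passed as bound parameters.
    return conn.execute(sql, tuple(example.values())).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein (accession TEXT, name TEXT, organism TEXT)")
conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
    ("P0001", "kinase A", "Homo sapiens"),
    ("P0002", "kinase A", "Mus musculus"),
    ("P0003", "protease B", "Mus musculus"),
])

# The "example": the user only states what a matching row looks like.
matches = query_by_example(conn, "protein", {"organism": "Mus musculus", "name": "kinase A"})
```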
Distributed databases
• From local to global attitude
• Data appears to be in one location but is most definitely
not

• A definition: Two or more data files in different locations,
periodically synchronized by the DBMS to keep data in
all locations consistent (A, B, C)

• An intricate network for combining and sharing
information
• Administrators praise fast network technologies!!!
• Users praise the internet!!!
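
Periodic synchronization across locations can be pictured with the toy sketch below. It assumes each record carries a version number and that the newest version wins; that rule, the `synchronize` function and the three sites are assumptions for illustration, and a real DBMS uses far more careful replication protocols.

```python
def synchronize(sites):
    """Bring every site to the newest version of every record.

    Each site maps a record key to a (version, value) pair. Keeping the
    highest version is a simplifying assumption of this sketch.
    """
    merged = {}
    for site in sites:
        for key, (version, value) in site.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    # Push the merged state back so all locations hold the same data.
    for site in sites:
        site.clear()
        site.update(merged)
    return merged

# Three locations that have drifted apart since the last synchronization.
site_a = {"SEQ001": (1, "ATGC")}
site_b = {"SEQ001": (2, "ATGCA"), "SEQ002": (1, "GGCC")}
site_c = {}

synchronize([site_a, site_b, site_c])
```

After the call, a user querying any of the three sites sees the same records, which is what makes the data appear to be in one location.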
Three main Points
• Database proliferation
– Dozens to hundreds at the moment
• More and more scientific discoveries result
from inter-database analysis and mining
• Rising complexity of required data-
combinations
– E.g. translational medicine: “from bench to
bedside” (genomic data vs. clinical data)
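
A small sketch of inter-database analysis, using two separate in-memory SQLite databases joined in one query. The table names, sample identifiers, genes and diagnoses are all invented; the only assumption that matters is that both databases share a sample identifier.

```python
import sqlite3

# The main connection stands in for a genomic database ...
conn = sqlite3.connect(":memory:")
# ... and a second, separate database stands in for clinical data.
conn.execute("ATTACH DATABASE ':memory:' AS clinical")

conn.execute("CREATE TABLE variant (sample_id TEXT, gene TEXT)")
conn.execute("CREATE TABLE clinical.diagnosis (sample_id TEXT, diagnosis TEXT)")
conn.executemany(
    "INSERT INTO variant VALUES (?, ?)",
    [("S1", "geneA"), ("S2", "geneB"), ("S3", "geneA")],
)
conn.executemany(
    "INSERT INTO clinical.diagnosis VALUES (?, ?)",
    [("S1", "condition X"), ("S3", "condition X"), ("S4", "condition Y")],
)

# Combine both sources: how often does each gene co-occur with each diagnosis?
combined = conn.execute("""
    SELECT v.gene, d.diagnosis, COUNT(*) AS n
    FROM variant AS v
    JOIN clinical.diagnosis AS d ON d.sample_id = v.sample_id
    GROUP BY v.gene, d.diagnosis
""").fetchall()
```

Samples present in only one of the two databases (S2 and S4 here) drop out of the join, which is one reason shared identifiers matter so much for this kind of analysis.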

Proliferation = great and rapid increase in numbers;
Grid = a network of evenly spaced horizontal and vertical lines;
Semantic = related to the meaning;
Biological databases
• Like any other database
– Data organization for optimal analysis

• Data is of different types
– Raw data (DNA, RNA, protein sequences)
– Curated data (annotated DNA, RNA and protein
sequences and structures, expression data)
A few biological databases
• Nucleotide Databases
Alternative Splicing, EMBL-Bank, Ensembl, Genomes Server, Genome,
MOT, EMBL-Align, Simple Queries, dbSTS Queries, Parasites, Mutations,
IMGT
• Genome Databases
Human, Mouse, Yeast, C. elegans, FlyBase, Parasites
• Protein Databases
Swiss-Prot, TrEMBL, InterPro, CluSTr, IPI, GOA, GO, Proteome Analysis,
HPI, IntEnz, TrEMBLnew, SP_ML, NEWT, PANDIT
• Structure Databases
PDB, MSD, FSSP, DALI
• Microarray Database
ArrayExpress
• Literature Databases
MEDLINE, Software Biocatalog, FlyBase Archives
• Alignment Databases
BAliBASE, Homstrad, FSSP
A short word on problems
• Even today we face some key limitations
– There is no standard format
• Every database or program has its own format
– There is no standard nomenclature
• Every database has its own names
– Data is not fully optimized
• Some datasets are missing information without any
indication of it
– Data errors
• Data is sometimes of poor quality, erroneous, misspelled
• Error propagation resulting from computer annotation
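
Some of these problems can at least be made visible when records are loaded. The sketch below flags missing fields and unexpected sequence characters. The required fields, the `find_problems` helper and the restricted DNA alphabet (A, C, G, T, N) are assumptions for illustration and not a standard.

```python
REQUIRED_FIELDS = ("accession", "organism", "sequence")  # hypothetical schema
DNA_ALPHABET = set("ACGTN")  # simplified; real formats allow more codes

def find_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Missing information should be reported, not silently accepted.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")
    # Catch characters that suggest a typing or conversion error.
    sequence = (record.get("sequence") or "").upper()
    invalid = set(sequence) - DNA_ALPHABET
    if invalid:
        problems.append("unexpected characters in sequence: " + "".join(sorted(invalid)))
    return problems

good = {"accession": "SEQ001", "organism": "Homo sapiens", "sequence": "ATGCN"}
bad = {"accession": "SEQ002", "organism": "", "sequence": "ATGX"}

good_report = find_problems(good)
bad_report = find_problems(bad)
```

A check like this does not repair the data, but it stops an incomplete or misspelled record from being passed on unnoticed to later annotation steps.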
