DE Module1
Database Applications:
Databases are widely used in many areas, for example:
o Banking: information on all transactions.
o Airlines: reservation and schedule information.
o Universities: registration and grade information.
o Sales: customer, product, and purchase information.
o Online retailers: order tracking, customized recommendations.
o Manufacturing: production, inventory, orders, supply chain.
o Human resources: employee records, salaries, tax deductions.
o Credit card transactions: For purchases on credit cards and generation of monthly statements.
o Telecommunication: For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the communication networks.
o Finance: For storing information about holdings, sales, and purchases of financial instruments
such as stocks and bonds.
Integrity problems
Atomicity of updates
Concurrent access by multiple users
Security problems
Advantages of Using the DBMS Approach
i. Controlling Redundancy and saving space.
ii. Restricting Unauthorized Access
iii. Avoiding Inconsistency, since every change is made in one place.
iv. Providing Storage Structures and Search Techniques for Efficient Query Processing
v. Providing Backup and Recovery
vi. Providing Multiple User Interfaces.
vii. Providing data sharing.
viii. Representing Complex Relationships among Data
ix. Enforcing Integrity so that data fetched will be correct.
Limitations of DBMS
i. High initial investment in hardware, software, and training.
ii. Overhead for providing security, concurrency control, recovery, and integrity functions
DATA MODELS:
A data model is a collection of conceptual tools that describe the data, their relationships and
consistency constraints.
ARCHITECTURE
There are two ways to view the architecture of a DBMS
i. Logical DBMS architecture
It deals with the way data is stored and presented to the user.
ii. Physical DBMS architecture
It is concerned with the software components that make up the DBMS.
Schema
The overall design or description of a database is the database schema, which is specified during the database
design and is not expected to change frequently.
Instance
The collection of the data stored in the database at a particular moment of time is known as a database instance
or database state or snapshot.
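The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The student table and its columns are assumptions made only for illustration: the schema is read back unchanged after rows are inserted, while the instance differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema: defined once at design time (hypothetical student table).
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, mark REAL)")

def schema(conn):
    # Column names and declared types, read from SQLite's table metadata.
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

def instance(conn):
    # The rows stored right now, i.e. the current database state (snapshot).
    return conn.execute(
        "SELECT rollno, name, mark FROM student ORDER BY rollno"
    ).fetchall()

schema_before = schema(conn)
instance_before = instance(conn)

# Ordinary updates change the instance, not the schema.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 81.5)")
conn.execute("INSERT INTO student VALUES (2, 'Ravi', 74.0)")

schema_after = schema(conn)
instance_after = instance(conn)
```

Comparing schema_before with schema_after and instance_before with instance_after shows that only the snapshot of stored rows moved.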
DATA ABSTRACTION
The main purpose of a database system is to provide users with an abstract view of the data, i.e., it hides the
details of how data are stored and maintained (the complexities) through several levels of abstraction.
The ANSI / SPARC model divides the system into three levels of abstraction.
i. Physical or Internal level.
It is the lowest level of abstraction and describes how the data are actually stored.
The physical level describes complex low-level data structures in detail.
It is expressed by an internal schema.
ii. Logical or Conceptual level.
The next-higher level of abstraction describes what data are stored in the database, and what
relationships exist among those data.
It presents a logical view of the entire database as a unified whole.
It is expressed by a conceptual schema, hence there is only one conceptual schema per
database.
The DBMS provides a DDL (Data Definition Language) for writing this schema, which defines the content only.
iii. View or External level
It is the highest level of abstraction which allows the user to see only the data of their own
interest.
There can be any number of external views, each described by an external schema.
Thus the interactions are simplified and the system may provide many views for the same
database.
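As a rough illustration of the view level, the sketch below defines two external views over one conceptual table using sqlite3. The table, the view names (exam_view, mailing_view) and the two user groups are hypothetical; a real system would also restrict who may query each view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual level: one logical description of the whole database.
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, mark REAL, address TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', 81.5, '12 Lake Road')")

# External view for a hypothetical exam office: no addresses.
conn.execute("CREATE VIEW exam_view AS SELECT rollno, name, mark FROM student")
# External view for a hypothetical mailing office: no marks.
conn.execute("CREATE VIEW mailing_view AS SELECT name, address FROM student")

exam_rows = conn.execute("SELECT * FROM exam_view").fetchall()
mailing_rows = conn.execute("SELECT * FROM mailing_view").fetchall()
```

Each group of users sees only the columns of interest to it, although both views are derived from the same stored data.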
ii. External / Conceptual mapping:
It defines the correspondence between a particular external view and conceptual level.
If the structure of the database at the conceptual level is changed, then this mapping must be
changed accordingly, so that the view from external level remains constant.
DATA INDEPENDENCE
The three levels of abstraction, along with the mappings between them, provide two distinct levels of data
independence. Data independence is the ability to modify a schema at one level without changing the schema at
the next higher level. It is of two types:
i. Logical data independence:
It is the ability to modify the conceptual or global schema without changing the external or user
schema or application programs. The change is absorbed by the external / conceptual mapping.
ii. Physical data independence:
It is the ability to modify the physical or internal schema without changing the conceptual (or
external) schema or application programs. The change is absorbed by the conceptual / internal
mapping.
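Both kinds of independence can be sketched with sqlite3, under the assumption that a view plays the role of the external schema and an index stands in for a change of storage structure. All table, view and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, mark REAL)")
conn.execute("INSERT INTO student VALUES (1, 'Asha', 81.5)")
conn.execute("CREATE VIEW student_marks AS SELECT rollno, name, mark FROM student")

# The application only ever issues this query against the external view.
APP_QUERY = "SELECT rollno, name, mark FROM student_marks ORDER BY rollno"
before = conn.execute(APP_QUERY).fetchall()

# Logical change: split the conceptual table into two tables.
conn.execute("CREATE TABLE student_info (rollno INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE student_result (rollno INTEGER PRIMARY KEY, mark REAL)")
conn.execute("INSERT INTO student_info SELECT rollno, name FROM student")
conn.execute("INSERT INTO student_result SELECT rollno, mark FROM student")
conn.execute("DROP VIEW student_marks")
conn.execute("DROP TABLE student")
# The external / conceptual mapping (the view definition) absorbs the change.
conn.execute(
    """CREATE VIEW student_marks AS
       SELECT i.rollno, i.name, r.mark
       FROM student_info AS i JOIN student_result AS r ON r.rollno = i.rollno"""
)
after_split = conn.execute(APP_QUERY).fetchall()

# Physical change: add an access path; no schema seen by the user changes.
conn.execute("CREATE INDEX idx_result_mark ON student_result (mark)")
after_index = conn.execute(APP_QUERY).fetchall()
```

The application query text is never edited; only the view definition and the storage structures change.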
Example:
Conceptual view:
    Name - string
    Rollno - number
    Mark - number
    Address - string
Physical view:
    Name - string of length 25, starting address xxx and offset xxx
    Rollno - number without decimal, starting address xxx and offset xxx
    Mark - number with decimal, starting address xxx and offset xxx
    Address - string of length 50, starting address xxx and offset xxx
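To make the physical view above more concrete, here is a minimal sketch using Python's struct module. The string lengths (25 and 50) come from the example; the 4-byte integer for Rollno, the 8-byte float for Mark, the field order and the absence of padding are assumptions, since the notes leave addresses and offsets as xxx.

```python
import struct

# Assumed fixed-length layout, little-endian, no padding:
#   25s = Name (25 bytes), i = Rollno (4-byte int),
#   d   = Mark (8-byte float), 50s = Address (50 bytes)
RECORD = struct.Struct("<25sid50s")

# Byte offset of each field inside one record under the assumed layout.
OFFSETS = {"name": 0, "rollno": 25, "mark": 29, "address": 37}

def pack_student(name, rollno, mark, address):
    # Short strings are padded with zero bytes up to the fixed length.
    return RECORD.pack(name.encode("ascii"), rollno, mark, address.encode("ascii"))

def unpack_student(raw):
    name, rollno, mark, address = RECORD.unpack(raw)
    return (
        name.rstrip(b"\x00").decode("ascii"),
        rollno,
        mark,
        address.rstrip(b"\x00").decode("ascii"),
    )

raw = pack_student("Asha", 1, 81.5, "12 Lake Road")
record_size = RECORD.size
student = unpack_student(raw)
```

A user at the conceptual level sees only the four named fields; the byte layout could change (for example a longer Name field) without that description changing.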
i. DML Precompiler :
Data Manipulation Language (DML) defines the set of commands that modify and process data for
output. The DML precompiler converts DML statements embedded in an application program into
normal procedure calls in the host language. It interacts with the query processor to generate the code.
v. Query Processor:
It changes the query statement from its English-like syntax into a form the DBMS understands. It
usually consists of two parts:
a. Parser
It checks the syntax of each statement by breaking it into basic units, and ensures that the
statement consists of the proper component parts.
b. Query Optimizer
It tries to choose the most efficient way of executing the query by generating several query
plans (i.e., alternative orderings of the operations) and estimating which plan will execute
most efficiently. The factors taken into consideration are CPU time, disk time, network
time, sorting time and scan methods.
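The two parts of the query processor can be observed, at least roughly, in SQLite through Python's sqlite3 module. This is only a sketch of one system's behaviour, not a description of every DBMS: a misspelled statement is rejected at the parsing stage, and EXPLAIN QUERY PLAN reports which plan the optimizer picked before and after an index exists. The table and index names are hypothetical, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (rollno INTEGER, name TEXT, mark REAL)")

# Parser: a statement that is not well formed never reaches execution.
try:
    conn.execute("SELEC name FROM student")
    syntax_ok = True
except sqlite3.OperationalError:
    syntax_ok = False

def plan(conn, sql):
    # The last column of each plan row is a human-readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

QUERY = "SELECT name FROM student WHERE rollno = 7"

# With no index available, the only option is to scan the whole table.
plan_without_index = plan(conn, QUERY)

# After an index is added, the optimizer can choose an index search.
conn.execute("CREATE INDEX idx_student_rollno ON student (rollno)")
plan_with_index = plan(conn, QUERY)
```

The query text is identical in both cases; only the plan chosen by the optimizer differs.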
DATABASE USERS
A primary goal of a database system is to retrieve information from and store new information in the
database. People who work with a database can be categorized as database users or database
administrators.
There are four different types of database-system users, differentiated by the way they expect to interact
with the system.
1. Naive users :
They are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously.
For example, a bank teller who needs to transfer $50 from account A to account B
invokes a program called transfer. This program asks the teller for the amount of money to be
transferred, the account from which the money is to be transferred, and the account to which the money
is to be transferred.
Examples: people accessing a database over the web, bank tellers, clerical staff.
The typical user interface for naive users is a forms interface, where the user can fill in
appropriate fields of the form. Naive users may also simply read reports generated from the
database.
2. Application programmers :
They are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces.
Rapid application development (RAD) tools are tools that enable an application programmer
to construct forms and reports without writing a program.
They must be familiar with the DBMSs to accomplish their task.
3. Sophisticated users :
They interact with the system without writing programs.
They form their requests in a database query language.
They submit each such query to a query processor, whose function is to break down DML
statements into instructions that the storage manager understands.
Analysts who submit queries to explore data in the database fall in this category.
Online analytical processing (OLAP) tools simplify analysts’ tasks by letting them view
summaries of data in different ways.
For instance, an analyst can see total sales by region or by product, or by a combination
of region and product.
4. Specialized users :
They are sophisticated users who write specialized database applications that do not fit into the
traditional data-processing framework.
Among these applications are computer-aided design systems, knowledge base and expert
systems, systems that store data with complex data types etc.
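The kind of request a sophisticated user submits, such as total sales by region, by product, or by both, can be sketched as plain queries through sqlite3. The sales table and its rows are made up for illustration; a real OLAP tool would build such summaries interactively.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "Pen", 100.0),
        ("North", "Book", 250.0),
        ("South", "Pen", 50.0),
        ("South", "Pen", 25.0),
    ],
)

# Total sales by region.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Total sales by product.
by_product = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()

# Total sales by the combination of region and product.
by_region_product = conn.execute(
    "SELECT region, product, SUM(amount) FROM sales "
    "GROUP BY region, product ORDER BY region, product"
).fetchall()
```

No application program is written here: each summary is a single query handed to the query processor.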
Database Administrator (DBA)
Centralized control of the database is exercised by a person or group of persons under the supervision
of a high-level administrator known as the Database Administrator. The DBA is responsible for creating,
modifying and maintaining the three database levels. The basic functions of the DBA are:
a. Schema definition:
DBA creates the original database schema by writing definitions, which are stored
permanently in the data dictionary.
b. Storage structure and access method definition
DBA creates appropriate storage structures and access methods by writing a set of
definitions.
c. Schema and physical-organization modification
DBA is involved in the rare modifications to the database schema or to the description of
physical organization.
d. Granting of authorization for data access.
DBA grants the various types of authorization to the various users.
e. Routine maintenance.
Periodically backing up the database
Ensuring that enough free disk space is available for normal operations, and upgrading
disk space as required.
Monitoring jobs running on the database
f. Integrity constraints
DBA specifies the constraints, which are checked before any data addition, modification
etc.
g. Recovery
DBA is also responsible for recovery of database from failures.
h. Overall custodian
DBA is the overall custodian and controls the database.
DATABASE LANGUAGES
DBMS provides different types of languages. They are:
A data dictionary contains metadata—that is, data about data. The schema of a table is an example of
metadata.
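As one concrete illustration, SQLite keeps its data dictionary in a system table named sqlite_master, which can be queried like any other table. The account table and index below are hypothetical; other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
conn.execute("CREATE INDEX idx_balance ON account (balance)")

# Metadata: one row per schema object, not per stored data value.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# The dictionary also keeps the definition text of each object.
ddl = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'account'"
).fetchone()[0]
```

Here catalog lists the table and its index, and ddl holds the stored definition of the account table.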
The storage structure and the access methods used by the DBMS are specified by a set of definitions in a
special type of DDL called the data storage and definition language. The compiled forms of these definitions
specify the implementation details of the database schemas, which are usually hidden from the users.
The data values stored in the database must satisfy certain consistency constraints (for example,
the balance on an account should not fall below $100). The database system checks these
constraints every time the database is updated.
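The $100 rule can be sketched as a CHECK constraint declared in the DDL, again using sqlite3. The account table, the transfer function and the amounts are assumptions for illustration. The transfer credits the target first so that a failed debit shows the whole update being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint is part of the schema, so the DBMS checks it on every update.
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 100))"
)
conn.execute("INSERT INTO account VALUES ('A', 150.0)")
conn.execute("INSERT INTO account VALUES ('B', 300.0)")
conn.commit()

def transfer(conn, source, target, amount):
    # Hypothetical transfer program: both updates succeed or neither does.
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT acc_no, balance FROM account"))

first = transfer(conn, "A", "B", 50.0)   # leaves A at exactly 100: allowed
second = transfer(conn, "A", "B", 50.0)  # would leave A at 50: rejected
final = balances(conn)
```

After the rejected transfer, the credit already applied to B is rolled back, so the stored balances remain consistent.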
E. F. CODD'S RULES
Dr Edgar F. Codd, after his extensive research on the Relational Model of database systems, came up with
twelve rules of his own, which according to him, a database must obey in order to be regarded as a true
relational database.
Rule 0 is a foundation rule, which acts as a base for all the other rules.
Rule 0 : Foundation Rule
A relational database management system must manage its stored data using only its relational capabilities.
For example, if two tables are merged or one is split into two different tables, there should be no impact or
change on the user application. This is one of the most difficult rules to apply.