Unit I
INTRODUCTION
DEFINITION:
A database-management system (DBMS) consists of a collection of interrelated data
and a set of programs to access those data.
The collection of data is referred to as the database.
The primary goal of a DBMS is to provide an environment that is both
convenient and efficient to use in retrieving and storing database information.
Logical level:
The next higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data.
The entire database is thus described in terms of a small number of relatively
simple structures.
The logical level of abstraction is used by database administrators, who must
decide what information is to be kept in the database.
View level:
The highest level of abstraction describes only part of the entire database.
Most users need to access only a part of the database. The view level of
abstraction is defined to simplify their interaction with the system.
The system may provide many views for the same database.
Ex: In a Pascal-like language, a type declaration can define a new record called
customer with four fields. Each field has a name and a type associated with it.
At the physical level, a customer, account, or employee record can be described
as a block of consecutive storage locations.
At the logical level, each such record is described by a type definition.
At the view level, computer users see a set of application programs that hide
details of the data types.
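The customer record and the levels of abstraction described above can be sketched in Python. This is only a rough analogue of a Pascal-like type definition: the four field names and the mailing_view function are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Customer:
    # Field names and types are assumed for illustration only.
    customer_name: str
    social_security: str
    customer_street: str
    customer_city: str


# Logical level: the type definition lists each field's name and type.
field_names = [f.name for f in fields(Customer)]


# View level: an application program exposes only part of the record
# and hides the details of the underlying data types.
def mailing_view(c: Customer) -> dict:
    return {"name": c.customer_name, "city": c.customer_city}
```

The physical level (how the record is laid out in storage) does not appear in the sketch at all, which is the point of the abstraction.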
Instance and Schemas:
The collection of information stored in the database at a particular moment is
called an instance of the database.
The overall design of the database is called the database schema.
Variables of such a record type can be declared in a Pascal-like language:
var customer1: customer;
Variable customer1 now corresponds to an area of storage containing a customer
type record.
Database systems have several schemas, partitioned into levels of abstraction.
Physical Schema: At the lowest level.
Logical Schema: At the intermediate level.
Sub Schema: At the highest level.
Database systems support one physical schema, one logical schema, and several
subschemas.
Data Independence:
The ability to modify a schema definition at one level without affecting a schema
definition at the next higher level is called data independence.
There are 2 levels of data independence.
Physical data independence:
It is the ability to modify the physical schema without causing application programs
to be rewritten.
Modifications at the physical level are occasionally necessary to improve performance.
Logical data independence:
It is the ability to modify the logical schema without causing application programs
to be rewritten.
Modifications at the logical level are necessary whenever the logical structure of
the database is altered.
Logical data independence is more difficult to achieve than physical data
independence, since application programs are heavily dependent on the logical
structure of the data.
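The two kinds of data independence can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table contents, the index name, the added column opened_on, and the view account_view are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (branch_name TEXT, account_number TEXT, balance INTEGER)"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [("Brighton", "A-1", 500), ("Downtown", "A-2", 900)],  # sample rows (assumed)
)

# The "application program": a fixed query that is never rewritten below.
QUERY = (
    "SELECT account_number FROM account "
    "WHERE branch_name = ? ORDER BY account_number"
)


def run_application_query(branch):
    return conn.execute(QUERY, (branch,)).fetchall()


before = run_application_query("Brighton")

# Physical-level change: add an index to speed up lookups by branch name.
conn.execute("CREATE INDEX idx_account_branch ON account (branch_name)")
after = run_application_query("Brighton")

# Logical-level change: add a column. A view keeps the old three-column
# shape available to programs that depend on it.
conn.execute("ALTER TABLE account ADD COLUMN opened_on TEXT")
conn.execute(
    "CREATE VIEW account_view AS "
    "SELECT branch_name, account_number, balance FROM account"
)
view_rows = conn.execute(
    "SELECT * FROM account_view ORDER BY account_number"
).fetchall()
```

The query returns the same rows before and after the index is created, so the physical change is invisible to the application. The logical change is harder to hide: a program that reads every column of account directly would see the new column, which is why the view is needed.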
Domain Constraints
Referential Integrity
Assertion
Authorization
System Structure:
A database system is partitioned into modules that deal with each of the
responsibilities of the overall system.
The functional components of a database system can be broadly divided into
query processor components and storage manager components.
The query processor components include:
DML Compiler:
The DML compiler translates DML statements in a query language into low-level
instructions that the query evaluation engine understands.
Embedded DML Precompiler:
The embedded DML precompiler converts DML statements embedded in an application
program into normal procedure calls in the host language.
DDL Interpreter:
The DDL interpreter interprets DDL statements and records them in a set of
tables containing metadata.
Query evaluation engine:
The query evaluation engine executes low-level instructions generated by the DML compiler.
The storage manager components include:
Authorization and integrity manager:
Tests for the satisfaction of integrity constraints and checks the authority of
users to access data.
Transaction manager:
Ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without conflicting.
File manager:
Manages the allocation of space on disk storage and the data structures used
to represent information stored on disk.
Buffer manager:
Is responsible for fetching data from disk storage into main memory
and for deciding what data to cache in memory.
Several data structures are required as part of the physical system implementation:
Data files
Store the database itself.
Data dictionary
Stores metadata about the structure of the database. The data dictionary
is used heavily.
Indices
Provide fast access to data items that hold particular values.
Statistical data
Stores statistical information about the data in the database.
Database applications are usually partitioned into two or three parts.
[Figure: two-tier architecture (user, application, network, database system) and three-tier architecture (user, application client, network, application server, database system).]
1.7. DATABASE USERS AND ADMINISTRATORS
Basic structure:
The account table has three column headers: branch-name, account-number, and
balance.
In the terminology of the relational model, these headers are called attributes. For each
attribute there is a set of permitted values, called the domain of that attribute.
(Ex) For the attribute branch-name, the domain is the set of all branch names.
Let D1 denote this set, D2 the set of all account numbers, and D3 the set of all
balances.
Any row of account must consist of a 3-tuple (V1, V2, V3), where
V1 is a branch name (V1 is in D1),
V2 is an account number (V2 is in D2),
V3 is a balance (V3 is in D3).
In general, account is a subset of D1 × D2 × D3.
A table of n attributes must be a subset of
D1 × D2 × ... × Dn-1 × Dn.
Because tables are essentially relations, we use the mathematical terms relation and
tuple in place of the terms table and row.
The account relation contains several tuples. Let the tuple variable t refer to the first
tuple of the relation.
t[branch-name] denotes the value of t on the branch-name attribute:
t[branch-name] = “Brighton”
t[balance] = 500
t[1] denotes the value of tuple t on the first attribute (branch-name),
t[2] denotes the value on account-number, and so on.
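The idea that a relation is a subset of the Cartesian product of its domains, and the two ways of addressing a tuple's values, can be sketched in Python. The domains are kept tiny for illustration; every value other than "Brighton" and 500 is an assumption.

```python
from collections import namedtuple
from itertools import product

# Small illustrative domains (assumed values).
D1 = {"Brighton", "Downtown"}   # branch names
D2 = {"A-101", "A-217"}         # account numbers
D3 = {500, 750}                 # balances

Account = namedtuple("Account", ["branch_name", "account_number", "balance"])

# The account relation: a set of 3-tuples.
account = {
    Account("Brighton", "A-217", 500),
    Account("Downtown", "A-101", 750),
}

# A relation must be a subset of D1 x D2 x D3.
is_subset = account <= set(product(D1, D2, D3))

t = Account("Brighton", "A-217", 500)
by_name = t.branch_name   # corresponds to t[branch-name]
by_position = t[0]        # corresponds to t[1]; Python indexes from 0
```

Using a set for the relation also reflects that a relation holds no duplicate tuples and has no inherent row order.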
Database schema:
o We must distinguish between the database schema, or the logical design of the
database, and a database instance, which is a snapshot of the data in the database
at a given instant in time.
o By convention, we use lowercase names for relations and
names beginning with an uppercase letter for relation schemas.
o In this notation, Account-schema denotes the relation schema for relation
account.
o Account-schema = (branch-name, account-number, balance)
o We denote the fact that account is a relation on Account-schema by account(Account-schema).
o A relation schema comprises a list of attributes and their corresponding
domains.
o Consider the branch relation:
o Branch-schema = (branch-name, branch-city, asset)
o Note that the attribute branch-name appears in both Branch-schema and
Account-schema.
Keys:
There are three types of keys:
Super keys
Primary keys
Candidate Keys
Candidate keys
o In Branch-schema, {branch-name} and {branch-name, branch-city} are both super
keys. {branch-name, branch-city} is not a candidate key, because its subset
{branch-name} is itself a super key. {branch-name} is a candidate key, and for
our purposes it also serves as the primary key.
o The attribute branch-city is not a super key, since two branches in the same city
may have different names.
o Let R be a relation schema. If we say that a subset K of R is a super key for R, we are
restricting consideration to relations r(R) in which no two distinct tuples have the same
values on all attributes in K. That is, if t1 and t2 are in r and t1 != t2, then t1[K] != t2[K].
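The super key condition above can be checked against a given relation instance with a short Python sketch. The function names and the sample branch rows are assumptions. A key is a property of the schema, so passing this check on one instance does not prove that K is a key; failing it does show that K is not.

```python
from itertools import combinations


def is_superkey(rows, K):
    """True if no two distinct rows agree on all attributes in K."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in K)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, K):
    """True if K is a super key and no proper subset of K is one."""
    K = tuple(K)
    if not is_superkey(rows, K):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(K))
        for subset in combinations(K, size)
    )


# Sample instance of the branch relation (values assumed).
branch = [
    {"branch_name": "Brighton", "branch_city": "Brooklyn", "asset": 7100000},
    {"branch_name": "Downtown", "branch_city": "Brooklyn", "asset": 9000000},
    {"branch_name": "Mianus", "branch_city": "Horseneck", "asset": 400000},
]
```

On this instance, is_superkey(branch, ["branch_city"]) is False because two branches share a city, while ["branch_name"] passes both checks.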
Strong entity set:
The primary key of the entity set becomes the primary key of the relation
When designing a database, you have to make decisions regarding how best to
take some system in the real world and model it in a database. This consists of deciding
which tables to create, what columns they will contain, as well as the relationships
between the tables. While it would be nice if this process was totally intuitive and
obvious, or even better automated, this is simply not the case. A well-designed database
takes time and effort to conceive, build and refine.
The benefits of a database that has been designed according to the relational model are
numerous. Some of them are:
Data entry, updates and deletions will be efficient.
Data retrieval, summarization and reporting will also be efficient.
Since the database follows a well-formulated model, it behaves predictably.
Since much of the information is stored in the database rather than in the
application, the database is somewhat self-documenting.
Changes to the database schema are easy to make.
The goal of this article is to explain the basic principles behind relational database
design and demonstrate how to apply these principles when designing a database using
Microsoft Access. This article is by no means comprehensive and certainly not
definitive. Many books have been written on database design theory; in fact, many
careers have been devoted to its study. Instead, this article is meant as an informal
introduction to database design theory for the database developer.
1.10 ER - MODEL
The E-R (entity-relationship) data model views the real world as a set of basic
objects (entities) and relationships among these objects.
It is intended primarily for the DB design process by allowing the specification of
an enterprise scheme. This represents the overall logical structure of the DB.
An entity is a “thing” or “object” in the real world that is distinguishable from all
other objects.
o (Ex) Each person in an enterprise is an entity. The social-security number 1001
uniquely identifies one particular person in the enterprise.
o An entity set is a set of entities of the same type that share the same properties
or attributes.
o Entity sets do not need to be disjoint. An entity is represented by a set of
attributes.
o Attributes are descriptive properties possessed by each member of an entity
set.
o A database includes a collection of entity sets, each of which contains any
number of entities of the same type.
o An attribute as used in the E-R Model can be characterized by the following
attribute types.
Entity sets:
o A super key is a set of one or more attributes that, taken collectively, allow us
to identify uniquely an entity in the entity set.
Constraints:
An E-R enterprise schema may define certain constraints to which the contents
of a database must conform. Two of the most important types of constraints are:
1. Mapping Cardinalities
Mapping cardinalities, or cardinality ratios, express the number of entities to
which another entity can be associated via a relationship set.
For a binary relationship set R between entity sets A and B, the mapping
cardinality must be one of the following:
One to one:
An entity in A is associated with at most one entity in B, and an entity in B is
associated with at most one entity in A.
One to many:
An entity in A is associated with any number (zero or more) of
entities in B. An entity in B, however, can be associated with at most one
entity in A.
Many to one:
An entity in A is associated with at most one entity in B. An entity in B, however,
can be associated with any number (zero or more) of entities in A.
Many to many:
An entity in A is associated with any number (zero or more) of entities in B, and
an entity in B is associated with any number (zero or more) of entities in A.
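The four cases can be told apart mechanically for a given set of relationships. The sketch below reports the most restrictive mapping cardinality that a relationship instance satisfies. The function name and the representation of a relationship set as (a, b) pairs are assumptions, and a cardinality constraint is declared on the schema rather than inferred from data.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs."""
    pairs = set(pairs)
    b_per_a = Counter(a for a, _ in pairs)  # how many B entities each A has
    a_per_b = Counter(b for _, b in pairs)  # how many A entities each B has
    a_at_most_one = all(n <= 1 for n in b_per_a.values())
    b_at_most_one = all(n <= 1 for n in a_per_b.values())
    if a_at_most_one and b_at_most_one:
        return "one to one"
    if b_at_most_one:
        # Each B has at most one A, but some A has several B.
        return "one to many"
    if a_at_most_one:
        # Each A has at most one B, but some B has several A.
        return "many to one"
    return "many to many"
```

For example, mapping_cardinality([("c1", "l1"), ("c1", "l2")]) yields "one to many", since c1 is associated with two entities in B while each entity in B has a single partner in A.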
2. Participation Constraints
The participation of an entity set E in a relationship set R is said to be total if every
entity in E participates in at least one relationship in R. If only some entities in E
participate in relationships in R, the participation of entity set E in relationship R is
said to be partial.
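Total and partial participation can be checked for a given instance in the same spirit. In this minimal sketch the function name, the position argument, and the sample customer and loan identifiers are all assumptions.

```python
def participation(entity_set, relationship, position):
    """Return "total" if every entity appears in at least one relationship.

    position selects which component of each relationship tuple belongs
    to the entity set being checked (0 for the first, 1 for the second).
    """
    participating = {rel[position] for rel in relationship}
    return "total" if set(entity_set) <= participating else "partial"


# Sample data (assumed): every loan has a borrower, but customer c3 has no loan.
customers = {"c1", "c2", "c3"}
loans = {"L-1", "L-2"}
borrower = {("c1", "L-1"), ("c2", "L-2")}

loan_participation = participation(loans, borrower, 1)
customer_participation = participation(customers, borrower, 0)
```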
Keys:
A key allows us to identify a set of attributes that suffice to distinguish entities
from each other. Keys also help uniquely identify relationships, and thus distinguish
relationships from each other.
Entity Sets:
A super key is a set of one or more attributes that, taken collectively, allow us to
identify uniquely an entity in the entity set
For example, the customer-id attribute of the entity set customer is sufficient to
distinguish one customer entity from another. Thus, customer-id is a super key.
Similarly, the combination of customer-name and customer-id is a super key for the
entity set customer. The customer-name attribute of customer is not a super key,
because several people might have the same name. If K is a super key, then so is any
superset of K.
We are often interested in super keys for which no proper subset is a super key.
Such minimal super keys are called candidate keys.
We shall use the term primary key to denote a candidate key that is chosen by the
database designer as the principal means of identifying entities within an entity set. A
key (primary, candidate, and super) is a property of the entity set, rather than of the
individual entities.
Relationship Sets:
Let R be a relationship set involving entity sets E1, E2, ..., En. Let primary-key(Ei)
denote the set of attributes that forms the primary key for entity set Ei. Assume for
now that the attribute names of all primary keys are unique, and each entity set
participates only once in the relationship. The composition of the primary key for a
relationship set depends on the set of attributes associated with the relationship set R.
If the relationship set R has no attributes associated with it, then the set
of attributes
o The database designer may have first identified a checking-account entity set
with the attributes account-number, balance, and overdraft-amount,
and a savings-account entity set with the attributes account-number, balance, and
interest-rate.
o There are similarities between the checking-account entity set and the
savings-account entity set in the sense that they have several attributes in common.
o This commonality can be expressed by generalization, which is a containment
relationship that exists between a higher-level entity set and one or more
lower-level entity sets.
o (Ex) account is the higher-level entity set, and savings-account and
checking-account are lower-level entity sets.
o Higher-level and lower-level entity sets may also be designated by the terms
superclass and subclass, respectively.
o The account entity set is the superclass of the savings-account and
checking-account subclasses.
o Generalization is a simple inversion of specialization.
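The superclass and subclass terminology maps naturally onto class inheritance. As a minimal sketch, the account generalization could be written in Python as follows. The attribute names follow the text, while the Python types are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Attributes common to both lower-level entity sets.
    account_number: str
    balance: float


@dataclass
class SavingsAccount(Account):
    # Attribute specific to savings accounts.
    interest_rate: float


@dataclass
class CheckingAccount(Account):
    # Attribute specific to checking accounts.
    overdraft_amount: float


savings = SavingsAccount("A-1", 100.0, 0.04)  # sample values (assumed)
```

Each subclass inherits account_number and balance, so the shared attributes are declared once at the higher level, which is the commonality that generalization expresses.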