WWW Jbiet Edu in
UNIT-I
Data:
It is a collection of information.
Facts that can be recorded and that have implicit meaning are known as 'data'.
Example:
Customer ----- 1.cname.
2.cno.
3.ccity.
Database:
Database System:
It is a computerized system whose overall purpose is to maintain information and to make
that information available on demand.
Advantages:
2.Inconsistency can be avoided.
A DBMS is a collection of programs that enables users to create and maintain a database. In other words, it is a general-purpose
software system that provides users with the processes of defining, constructing and manipulating databases for
various applications.
Advantages of DBMS:
1.Data Independence.
2.Efficient Data Access.
3.Data Integrity and security.
4.Data administration.
5.Concurrent access and Crash recovery.
6.Reduced Application Development Time.
Applications
Database Applications:
Banking: all transactions
Airlines: reservations, schedules
Universities: registration, grades
Sales: customers, products, purchases
Online retailers: order tracking, customized recommendations
Manufacturing: production, inventory, orders, supply chain
Human resources: employee records, salaries, tax deductions
Many persons are involved in the design, use and maintenance of any database. These persons can be
classified into 2 types as below.
e. Database Tuning:
The DBA is responsible for modifying the database to ensure adequate
performance as requirements change.
2.Database Designers:
Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate
structures to represent and store this data.
3. End Users:
People who wish to store and use data in a database.
End users are the people whose jobs require access to the database for querying, updating and generating
reports, listed as below.
4.System Analyst:
These people determine the requirements of end users and develop specifications for transactions.
Department of CSE,JBIET
b. Workers behind the scene:
1.DBMS Designers and Implementers:
These are the people who design and implement the DBMS modules and interfaces as a software package.
2.Tool Developers:
These are the people who design and implement tools, i.e. software packages for database design, performance
monitoring, prototyping and test data generation.
1.Physical Level:
This is the lowest level, which describes how the data is actually
stored.
Example:
The storage of a customer account database can be described at this level.
2.Logical Level:
This is the next higher level, which describes what data are stored in the database and what relationships
exist among those data.
Example:
Each record can be described by a type definition such as
type customer = record
cust_name: string;
cust_city: string;
cust_street: string;
end;
1. Relational model.
2. Network model.
3. Hierarchical model.
3. Physical Models:
These models can be used in describing the data at the lowest level, i.e. physical
level. These models can be classified into
1. Unifying model
2. Frame memory model.
UNIT-2
The E-R model can be used to describe the data involved in a real world enterprise in terms of
objects and their relationships.
Uses:
These models can be used in database design.
It provides useful concepts that allow us to move from an informal
description to precise description.
This model was developed to facilitate database design by allowing the specification of overall logical structure
of a database.
It is extremely useful in mapping the meanings and interactions of real world enterprises onto a conceptual
schema.
These models can be used for the conceptual design of
database applications.
'Design the logical and physical structure of 1 or more databases to accommodate the information needs of
the users in an organization for a defined set of applications'.
The goals of database design are as below.
1.Satisfy the information content requirements of the specified users
and applications.
2.Provide a natural and easy to understand structuring of the
information.
3.Support processing requirements and any performance objectives
such as response time, processing time and storage space.
1.Expressiveness:
The data model should be expressive enough to distinguish different types of data, relationships and
constraints.
3.Minimality:
The model should have a small number of basic concepts.
4.Diagrammatic Representation:
The model should have a diagrammatic notation for displaying the conceptual schema.
5.Formality:
A conceptual schema expressed in the data model must represent a formal specification of the data.
Example:
Cust_name : string;
Cust_no : integer;
Cust_city : string;
Under this, we must choose a DBMS to implement our database design and convert the conceptual database
design into a database schema.
The choice of DBMS is governed by a number of factors as below.
1.Economic Factors.
2.Organizational Factors.
Explanation is as below.
1.Economic Factors:
b. Maintenance Cost:
This is the cost of receiving standard maintenance service from the vendor and for keeping the DBMS
version up to date.
g. Operating Cost:
The cost of continued operation of the database system.
2.Organizational Factors:
These factors, which support the organization of the vendor, can be listed as below.
a. Data Complexity:
As the data becomes more complex, the need for a DBMS grows.
b. Sharing among applications:
The greater the sharing among applications, the more the redundancy among files and hence the greater the
need for a DBMS.
c. Dynamically evolving or growing data:
If the data changes constantly, it is easier to cope with these changes using a DBMS than using a file
system.
d. Frequency of ad hoc requests for data:
File systems are not suitable for ad hoc retrieval of data.
e. Data Volume and Need for Control:
These 2 factors increase the need for a DBMS.
Example:
Customer database can be represented in the form of tables or diagrams.
3. Schema Refinement:
Under this, we have to analyze the collection of relations in our relational database schema to identify the
potential problems.
2.ENTITIES
1. It is a collection of objects.
2. An entity is an object that is distinguishable from other objects by a set of attributes.
3. This is the basic object of E-R Model, which is a 'thing' in the real world with an independent existence.
4. An entity may be an 'object' with a physical existence.
5. Entities can be represented by 'Rectangles'.
Example:
i. Customer, account etc.
3. ATTRIBUTES
Characteristics of an entity are called as attributes.
The properties of a particular entity are called as the attributes of that entity.
Example:
Name, street_address, city --- customer database.
Acc-no, balance --- account database.
Types:
These can be classified into following types.
1.Simple Attributes.
2.Composite Attributes.
3.Single Valued Attributes.
4.Multivalued Attributes.
5.Stored Attributes.
6.Derived Attributes.
Explanation is as below.
1.Simple Attributes:
The attributes that are not divisible are called as 'simple or atomic attributes'.
Example:
cust_name, acc_no etc.
2.Composite Attributes:
The attributes that can be divided into smaller subparts, which represent more basic attributes with
independent meaning, are called as 'composite attributes'.
These are useful to model situations in which a user sometimes refers to the composite attribute as a unit but at
other times refers specifically to its components.
Example:
Street_address can be divided into 3 simple attributes: Number, Street and Apartment_no.
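A minimal Python sketch of a composite attribute is given below. It assumes the three components named in the example; the class names, the sample values and the helper `as_unit` are only illustrative and do not come from the notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreetAddress:
    """Composite attribute made of three simple attributes."""
    number: int
    street: str
    apartment_no: str

    def as_unit(self) -> str:
        # Refer to the composite attribute as a single unit.
        return f"{self.number} {self.street}, Apt {self.apartment_no}"


@dataclass(frozen=True)
class Customer:
    cust_name: str                  # simple (atomic) attribute
    street_address: StreetAddress   # composite attribute


c = Customer("James", StreetAddress(12, "Main Street", "4B"))
print(c.street_address.as_unit())   # the whole address
print(c.street_address.street)      # one component only
```

The same value can be read either as a whole or component by component, which is the situation composite attributes are meant to model.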
5.Derived Attributes:
An attribute which is derived from another attribute is called as a ‘derived attribute’.
Example:
The ‘Age’ attribute is derived from another attribute, ‘Date’.
6.Stored Attribute:
An attribute which is not derived from another attribute is called as a ‘stored attribute’.
Example:
In the above example, ‘Date’ is a stored attribute.
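The difference between a stored and a derived attribute can be sketched in Python as below. The sketch assumes that ‘Date’ means the date of birth; the field name `birth_date`, the method `age` and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    cust_name: str
    birth_date: date  # stored attribute (assumed meaning of 'Date')

    def age(self, today: date) -> int:
        """Derived attribute: computed from the stored date, never stored itself."""
        years = today.year - self.birth_date.year
        # Subtract one year if the birthday has not yet occurred in this year.
        if (today.month, today.day) < (self.birth_date.month, self.birth_date.day):
            years -= 1
        return years


james = Customer("James", date(1991, 6, 15))
print(james.age(date(2024, 6, 14)))  # 32
print(james.age(date(2024, 6, 15)))  # 33
```

Because the age is recomputed on demand, it can never disagree with the stored date.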
4. ENTITY SETS
Entity Type:
A collection of entities that have the same attributes is called as an 'entity type'.
Each entity type is described by its name and attributes.
Entity Set:
The collection of all entities of a particular entity type in the database at any point of time is called as an entity
set.
The entity set is usually referred to using the same name as the entity type.
An entity type is represented in ER diagrams as a rectangular box enclosing the entity type name.
Example:
Collection of customers.
5. Relationships
6. Relationship Sets
It is a collection of relationships.
Primary Key:
The attribute, or set of attributes, whose values can be used to identify uniquely each row of a table.
Weak Entity:
A weak entity can be identified uniquely by considering some of its attributes in conjunction with the primary key of
another entity.
The symbols that can be used in this model are as follows.
EXAMPLE:
[Figure: the entity sets Customer and Account connected by the relationship set Cust_acc.]
Descriptive Attributes:
A relationship can also have some attributes, which are called as ‘descriptive
attributes’. These are used to record information about the relationship.
Example:
James of ‘Employees’ entity set works in a department since 1991.
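[Figure: Employees (Name, Street, City) and Departments (Dno, Dname, Budget) connected by a relationship with the descriptive attribute Since.]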
Ternary Relationship:
1.Key Constraints:
These can be classified into 4 types as below.
1.Many to Many:
An employee is allowed to work in different departments and a department is allowed to have
several employees.
[Figure: Employees (Name, Street, City) and Departments (Dno, Dname, Budget) connected by a relationship with the attribute Since.]
3.Many to One:
Each employee works in at most 1 department, i.e., many employees can work in the same department.
[Figure: Employees (Name, Street, City) and Departments (Dno, Dname, Budget) connected by a relationship with the attribute Since.]
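The many-to-one key constraint can be checked on sample data as in the following sketch. It assumes the relationship set is held as a list of (employee, department) pairs; the function name and the sample names are hypothetical.

```python
from collections import Counter


def is_many_to_one(works_in):
    """True if every employee appears in at most one (employee, department) pair."""
    counts = Counter(employee for employee, _department in works_in)
    return all(n <= 1 for n in counts.values())


# Several employees may share a department, but nobody works in two departments.
many_to_one = [("James", "D1"), ("Asha", "D1"), ("Ravi", "D2")]
# James works in two departments, so this instance is many-to-many.
many_to_many = [("James", "D1"), ("James", "D2"), ("Asha", "D1")]

print(is_many_to_one(many_to_one))   # True
print(is_many_to_one(many_to_many))  # False
```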
The participation constraint specifies whether the existence of an entity depends on its being related to
another entity via the relationship type.
Every department is required to have a manager. This requirement is an example of a participation
constraint. There are 2 types of participation constraints, which are as below.
1.Total.
2.Partial.
Explanation is as below.
1.Total:
The participation of an entity set in a relationship set is said to be ‘total’ if every entity in the
entity set takes part in at least one relationship.
Example:
If every department must have a manager, the participation of the entity set ‘departments’ in the
relationship set ‘manages’ is total.
2.Partial:
A participation that is not total is said to be partial.
Example:
Participation of the entity set ‘employees’ in ‘manages’ is partial, since not every employee gets to
manage a department.
In an E-R diagram, total participation is displayed as a ‘double line’ connecting the participating entity type
to the relationship, whereas partial participation is represented by a single line.
In an alternative notation, if the participation of an entity set in a relationship set is total, then a thick line connects the
two. The presence of an arrow indicates a key constraint.
[Figure: partial participation. Labels: Name, No, Dname, Budget.]
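Total and partial participation can be tested on sample data as in the sketch below. It assumes that ‘manages’ is stored as a set of (employee, department) pairs; the function name, the `position` parameter and the sample values are hypothetical.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity occurs in the relationship set, else 'partial'.

    position is the index of the entity set inside each relationship tuple.
    """
    participating = {pair[position] for pair in relationship_set}
    return "total" if set(entity_set) <= participating else "partial"


employees = {"James", "Asha", "Ravi"}
departments = {"D1", "D2"}
manages = {("James", "D1"), ("Asha", "D2")}  # (employee, department) pairs

print(participation(departments, manages, 1))  # total: every department has a manager
print(participation(employees, manages, 0))    # partial: Ravi manages nothing
```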
A weak entity set can be identified uniquely only by considering some of its attributes in conjunction
with the primary key of another entity (Identifying owner).
For any weak entity set, following restrictions must hold.
a. The owner entity set and the weak entity set must participate in
a one-to-many relationship set, which is called as the
‘Identifying Relationship Set’ of the weak entity set.
b. The weak entity set must have total participation in the identifying relationship set.
Example:
The set of attributes of a weak entity set that uniquely identify a weak entity for a given owner entity
is called as ‘partial key of the weak entity set’.
Example:
The dependent weak entity set and its relationship to employees is shown in the following
diagram.
Linking them with a dark line indicates the total participation of dependents in policy.
To indicate that dependents is a weak entity and policy is its identifying relationship, we
draw both with dark lines.
To indicate that ‘pname’ is a partial key for dependents, we underline it using a broken line.
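One possible relational rendering of this weak entity set is sketched below with Python's built-in sqlite3 module. It is only an assumption about how the diagram could be mapped to tables: the owner key `ssn`, the column `age` and the sample rows are hypothetical, and the identifying relationship is folded into the Dependents table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE Dependents (
        pname TEXT NOT NULL,          -- partial key of the weak entity set
        age   INTEGER,
        ssn   TEXT NOT NULL,          -- key of the identifying owner
        PRIMARY KEY (pname, ssn),     -- partial key + owner key identify a dependent
        FOREIGN KEY (ssn) REFERENCES Employees (ssn) ON DELETE CASCADE
    )
""")
conn.executemany("INSERT INTO Employees VALUES (?, ?)",
                 [("111", "James"), ("222", "Asha")])
# The same pname may occur under two different owners.
conn.executemany("INSERT INTO Dependents VALUES (?, ?, ?)",
                 [("Joe", 5, "111"), ("Joe", 7, "222")])

try:
    conn.execute("INSERT INTO Dependents VALUES ('Joe', 9, '111')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True  # same pname under the same owner is not allowed

# Deleting the owner removes its dependents (total participation).
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dependents").fetchall()
print(duplicate_rejected, remaining)  # True [('Joe', '222')]
```

The NOT NULL foreign key with ON DELETE CASCADE is one way to express that a dependent cannot exist without its owner.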
4.Aggregation:
Aggregation is an abstraction for building composite objects from their component objects.
Aggregation is used to represent a relationship between a whole object and its component parts.
Aggregation allows us to indicate that a relationship set (identified through a dashed box) participates
in another relationship set.
This is illustrated with a dashed box around sponsors.
If we need to express a relationship among relationships, then we should use aggregation.
Aggregation versus Ternary Relationship:
We can use either aggregation or a ternary relationship for 3 or more entity
sets. The choice is mainly determined by
a. The existence of a relationship that relates a relationship set to an
entity set or to a second relationship set.
b. The choice may also be guided by certain integrity constraints that
we want to express.
[Figure: aggregation. Employees (Name, No) is linked by Monitors (attribute Until) to a dashed box enclosing the relationship set Sponsors between Projects and Departments.]
The information gathered in the requirements analysis step is used to develop a higher-level description of
the data.
The goal of conceptual database design is a complete understanding of the database structure, meaning
(semantics), inter-relationships and constraints.
Characteristics of this phase are as below.
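1.Expressiveness:
The data model should be expressive enough to distinguish different types of data, relationships and
constraints.
3.Minimality:
The model should have a small number of basic concepts.
4.Diagrammatic Representation:
The model should have a diagrammatic notation
for displaying the conceptual schema.
5.Formality:
A conceptual schema expressed in the data model must represent a formal specification of the data.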
Example:
Cust_name: string;
Cust_no: integer;
Cust_city: string;
Suppose that each department manager is given a ‘Dbudget’ as shown in the figure.
[Figure: Employees (Name, Street, City) and Departments (Dno, Dname, Budget) connected by a relationship with the attribute Since.]
There is at most 1 employee managing a department, but a given employee could manage
several departments (a 1 to many relationship).
We can store the starting date and ‘Dbudget’ for each manager-department pair.
This approach is natural if we assume that a manager receives a separate ‘Dbudget’ for each department
that he manages. But if the ‘Dbudget’ is a single sum covering all the departments that he manages, then every
‘manages’ relationship that involves a given employee will store the same (total) value.
So this leads to redundancy.
This can be solved by modelling the appointment of an employee as the manager of a group of departments.
We can model ‘mgr_appt’ as an entity set for manager appointments and use a ternary relationship, and we
can still have at most 1 manager for each department due to the 1 to many relationship.
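[Figure: Employees (Name, Street, City) and Departments (Dno, Dname, Budget) connected through a ternary relationship with the entity set Mgr_appt, which has the attributes Since and Dbudget.]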
The process of conceptual database design consists of describing small fragments of the application in terms of
E-R diagrams.
For a large enterprise, the design may
1.Require more than 1 designer.
2.Span data and applications used by a number of user groups.
Using a high level semantic data model such as ER diagrams for conceptual design offers the additional
advantages that,
1.The high level design can be diagrammatically represented.
2.Many people, who provide the input to the design process, easily understand it.
An alternative approach is to develop separate conceptual schemas for different user groups and then
integrate all those.
To integrate, we must establish correspondences between entities, relationships and attributes, so this
process is somewhat difficult.
The relations of degree 1 are called as ‘Unary Relations’.
The relations of degree 2 are called as ‘Binary Relations’.
The relations of degree 3 are called as ‘Ternary Relations’.
The relations of degree n are called as ‘n-ary Relations’.
UNIT-3
RELATIONAL MODEL
A database is a collection of 1 or more ‘relations’, where each relation is a table with rows and columns.
This is the primary data model for commercial data processing applications.
The major advantages of the relational model over the older data models are,
1.It is simple and elegant.
2.Simple data representation.
3.The ease with which even complex queries can be expressed.
Introduction:
The main construct for representing data in the relational model is a ‘relation’.
A relation consists of
1.Relation Schema.
2.Relation Instance.
Explanation is as below.
1.Relation Schema:
The relation schema describes the column heads for the table.
The schema specifies the relation’s name, the name of each field (column, attribute) and the ‘domain’
of each field.
A domain is referred to in a relation schema by the domain name and has a set of associated
values.
Example:
Student information in a university database to illustrate the parts of a relation schema.
Students (Sid: string, name: string, login: string, age: integer, gross: real)
This says that the field named ‘sid’ has a domain named ‘string’.
The set of values associated with domain ‘string’ is the set of all character strings.
2.Relation Instance:
Example:
[Figure: an instance of the Students relation, with its fields (attributes, columns) labelled.]
This example is an instance of the Students relation, which consists of 4 tuples and 5 fields. No two rows are
identical.
Degree:
The number of fields is called as ‘degree’.
This is also called as ‘arity’.
Cardinality:
The cardinality of a relation instance is the number of tuples in
it.
Example:
In the above example, the degree of the relation is 5 and the cardinality is 4.
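Degree and cardinality can be computed on a small instance as in the sketch below. The four rows are hypothetical sample data, since the original table is not reproduced here; only the field names come from the Students schema.

```python
# Field names taken from the Students schema; the rows are made-up sample data.
FIELDS = ("sid", "name", "login", "age", "gross")

# A relation instance is a set of tuples, so identical rows cannot occur twice.
students = {
    ("s1", "Asha", "asha@cs", 19, 3.3),
    ("s2", "Ravi", "ravi@cs", 18, 3.4),
    ("s3", "Mina", "mina@ee", 19, 3.8),
    ("s4", "John", "john@cs", 18, 3.2),
}

degree = len(FIELDS)         # number of fields (arity)
cardinality = len(students)  # number of tuples
print(degree, cardinality)   # 5 4

# Adding a row that is already present leaves the instance unchanged.
students.add(("s1", "Asha", "asha@cs", 19, 3.3))
print(len(students))         # 4
```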
Relational database:
It is a collection of relations with distinct relation
names.
Relational database schema:
It is the collection of schemas for the relations in the
database.
Instance:
An instance of a relational database is a collection of relation instances, one per relation schema in the
database schema.
Each relation instance must satisfy the domain constraints in its schema.
An integrity constraint (IC) is a condition that is specified on a database schema and restricts the data that can be
stored in an instance of the database.
Various restrictions on data can be specified on a relational database schema in the form of ‘constraints’.
A DBMS enforces integrity constraints, in that it permits only legal instances to be stored in the database.
Integrity constraints are specified and enforced at different times as below.
Legal Instance:
A database instance is legal if it satisfies all the integrity constraints specified on the database
schema.
The constraints can be classified into 4 types as below.
1.Domain Constraints.
2.Key Constraints.
3.Entity Integrity Constraints.
4.Referential Integrity Constraints.
Explanation is as below.
1.Domain Constraints
Domain constraints are the most elementary form of integrity constraints. They are tested easily by the system
whenever a new data item is entered into the database.
Domain constraints specify the set of possible values that may be associated with an attribute. Such
constraints may also prohibit the use of null values for particular attributes.
The data types associated with domains typically include standard numeric data types for integers
A relation schema specifies the domain of each field or column in the relation instance.
These domain constraints in the schema specify an important condition that each instance of the relation must
satisfy: the values that appear in a column must be drawn from the domain associated with that column.
Thus the domain of a field is essentially the type of that field.
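A domain check of this kind can be sketched in Python as below. The sketch assumes that each domain is modelled by a Python type (string as `str`, integer as `int`, real as `float`); the function name and the sample rows are hypothetical.

```python
# Schema of the Students relation: field name -> domain (modelled as a Python type).
STUDENTS_SCHEMA = {"sid": str, "name": str, "login": str, "age": int, "gross": float}


def satisfies_domain_constraints(row, schema):
    """True if the row has exactly the schema's fields and each value is in its domain."""
    if row.keys() != schema.keys():
        return False
    return all(isinstance(row[field], domain) for field, domain in schema.items())


good = {"sid": "s1", "name": "Asha", "login": "asha@cs", "age": 19, "gross": 3.3}
bad = {"sid": "s2", "name": "Ravi", "login": "ravi@cs", "age": "eighteen", "gross": 3.4}

print(satisfies_domain_constraints(good, STUDENTS_SCHEMA))  # True
print(satisfies_domain_constraints(bad, STUDENTS_SCHEMA))   # False: age is not an integer
```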
2.Key Constraints
1.Explain the concept of Super Key, Candidate Key and Primary Key with examples?(6 Marks, Feb-2004)
A key constraint is a statement that a certain minimal subset of the fields of a relation is a unique identifier for
a tuple.
Example:
The ‘students’ relation and the constraint that no 2 students have the same student id
(sid).
These can be classified into 3 types as below.
a. Candidate Key or Key.
b. Super Key.
c. Primary Key.
Explanation is as below.
a. Candidate Key or Key:
1.Explain ‘Candidate Key’?(4 Marks, September-2003)
A set of fields that uniquely identifies a tuple according to a key constraint is called as a ‘Candidate Key’ for
the relation.
This is also called as a ‘key’.
From the definition of candidate key, we have,
1.Two distinct tuples in a legal instance cannot have identical values
in all the fields of a key, i.e., in any legal instance, the values in the key
fields uniquely identify a tuple in the instance.
2.No proper subset of the set of fields in a key is a unique identifier for a tuple.
For example, the set of fields {sid, name} is not a key for
Students, because its subset {sid} is already a unique identifier.
A relation schema may have more than one key.
Example: In the above Students relation, the ‘sid’ field is a candidate key,
{sid}.
The value of a key attribute can be used to identify uniquely each tuple in the relation.
‘A set of attributes constituting a key’ is a property of the relation schema.
A key is determined from the meaning of attributes.
Every relation is guaranteed to have a key. Since a relation is a set of tuples, the set of all fields is always a
super key.
b. Super Key:
The set of fields that contains a key is called as a ‘super key’.
The set of 1 or more attributes that allows us to identify uniquely an entity in the entity set.
A super key specifies a uniqueness constraint that no 2 distinct tuples can have the same
value.
Every relation has at least 1 default super key, the set of all attributes.
Example:
Students (Relation)
Sid, Name, Login, Age, Gross (Fields)
One of the super keys = {Sid, Name, Login, Gross}
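The difference between a super key and a candidate key can be explored on sample data as in the sketch below. Since a key is a property of the schema, a check on one instance can only show that a set of fields is not a key; it cannot prove that it is one. The function names and the rows are hypothetical.

```python
from itertools import combinations


def is_superkey(rows, fields):
    """True if no two rows of this instance agree on all the given fields."""
    projected = [tuple(row[f] for f in fields) for row in rows]
    return len(set(projected)) == len(projected)


def is_candidate_key(rows, fields):
    """True if fields is a super key and no non-empty proper subset is one."""
    if not is_superkey(rows, fields):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(fields))
        for subset in combinations(fields, size)
    )


rows = [
    {"sid": "s1", "name": "Asha", "login": "asha@cs", "age": 19, "gross": 3.3},
    {"sid": "s2", "name": "Ravi", "login": "ravi@cs", "age": 18, "gross": 3.4},
    {"sid": "s3", "name": "Asha", "login": "asha@ee", "age": 19, "gross": 3.8},
]

print(is_superkey(rows, ("sid", "name")))       # True: it contains the key {sid}
print(is_candidate_key(rows, ("sid", "name")))  # False: {sid} alone is enough
print(is_candidate_key(rows, ("sid",)))         # True on this instance
print(is_superkey(rows, ("name", "age")))       # False: two rows share (Asha, 19)
```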
c. Primary Key:
This is also a candidate key, whose values are used to identify tuples in the relation. It is
common to designate one of the candidate keys as a primary key of the relation. The
attributes that form the primary key of a relation schema are underlined.
It is used to denote a candidate key that is chosen by the database designer as
the principal means of identifying entities with an entity set.
Example:
‘Sid’ of Students relation.
In SQL, we declare that a set of fields of a table constitutes a key by using the
‘UNIQUE’ constraint.
This ‘UNIQUE’ constraint specifies that 2 distinct tuples cannot have identical
values in the listed fields.
One of the candidate keys can be declared as the ‘primary key’ using the constraint
‘PRIMARY KEY’.
We can name a constraint by using the syntax as below.
CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20),
                       age INTEGER, gross REAL,
                       UNIQUE (name, age),
                       CONSTRAINT sid1 PRIMARY KEY (sid));
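The effect of these two constraints can be tried out with Python's built-in sqlite3 module, as in the sketch below. SQLite is only assumed here as a convenient engine, and other DBMSs may report violations differently; the helper `try_insert` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Students (sid CHAR(20), name CHAR(30), login CHAR(20),
                           age INTEGER, gross REAL,
                           UNIQUE (name, age),
                           CONSTRAINT sid1 PRIMARY KEY (sid))
""")


def try_insert(row):
    """Insert one row and report whether a key constraint rejected it."""
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert(("s1", "Asha", "asha@cs", 19, 3.3)),  # first row
    try_insert(("s1", "Ravi", "ravi@cs", 18, 3.4)),  # same sid: violates PRIMARY KEY
    try_insert(("s2", "Asha", "asha@ee", 19, 3.8)),  # same (name, age): violates UNIQUE
    try_insert(("s2", "Asha", "asha@ee", 20, 3.8)),  # differs in age
]
print(results)  # ['ok', 'rejected', 'rejected', 'ok']
```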