DBMS Unit-I Notes


UNIT-I

Database System Applications: A Historical Perspective, File Systems versus a DBMS, the Data
Model, Levels of Abstraction in a DBMS, Data Independence, Structure of a DBMS

Introduction to Database Design: Database Design and ER Diagrams, Entities, Attributes, and
Entity Sets, Relationships and Relationship Sets, Additional Features of the ER Model,
Conceptual Design With the ER Model

Data:

Data is a set of known facts or figures that have implicit meaning. It can also be defined as the
representation of facts, concepts, or instructions in a formal manner that is suitable for
understanding and processing. Data can be represented using letters (A-Z, a-z), digits (0-9), and
special characters (+, -, ., #, $, etc.), e.g., 25, "ajit".

Information:

Information is processed data on which decisions and actions are based. It can be defined as
data that has been organized and classified to provide meaningful values. E.g.: "The age of Ravi
is 25".

A DBMS processes the data stored in files into information.

File: A file is a collection of related data stored in secondary memory.

File oriented approach:

In the traditional file-oriented approach to information processing, each application has a
separate master file and its own set of personal files. In this approach the programs depend on
the files, and the files in turn become dependent on the programs.

Disadvantages of file oriented approach:

1) Data redundancy and inconsistency: The same information may be written in several files.
This redundancy leads to higher storage and access costs. It may also lead to data inconsistency,
that is, the various copies of the same data may no longer agree. For example, a changed customer
address may be reflected in a single file but not elsewhere in the system.

2) Difficulty in accessing data: A conventional file processing system does not allow data to be
retrieved in a convenient and efficient manner according to the user's choice.

3) Data isolation: Because data are scattered across various files, and the files may be in
different formats, writing new application programs to retrieve the appropriate data is difficult.

4) Integrity Problems: Developers enforce data validation in the system by adding appropriate
code in the various application programs. However when new constraints are added, it is difficult
to change the programs to enforce them.

5) Atomicity: It is difficult to ensure atomicity in a file processing system when a transaction
fails due to a power failure, networking problems, etc. (Atomicity: either all operations of
the transaction are reflected properly in the database or none are.)

6) Concurrent access: A file processing system generally does not allow several transactions to
access the same file at the same time.

7) Security problems: A file processing system provides no security mechanism to protect the
data from unauthorized user access.
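
The atomicity problem in point 5 is the kind of thing a DBMS handles through transactions. The
following is a minimal sketch using Python's built-in sqlite3 module; the account table, the
transfer function, and the simulated failure are assumptions made for illustration and are not
part of the original notes.

```python
import sqlite3

# Hypothetical two-account bank, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst; either both updates are kept or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?", (amount, src)
            )
            if fail_midway:
                # Stands in for a power or network failure between the two updates.
                raise RuntimeError("simulated failure")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?", (amount, dst)
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return {acc_no: balance} for all accounts."""
    return dict(conn.execute("SELECT acc_no, balance FROM account ORDER BY acc_no"))


transfer(conn, 1, 2, 200, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
transfer(conn, 1, 2, 200)
after_success = balances(conn)   # both updates were applied
print(after_failure, after_success)
```

With plain files, the program itself would have to undo the first write after a crash; here the
rollback is done by the database engine.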

Database:

It is a collection of interrelated data. These can be stored in the form of tables. A database can be
of any size and varying complexity.

Example: A customer database consists of the fields cno, cname, and ccity.

Cno   Cname   Ccity
1     Hari    Hyd
2     Sure    Ban

Fig: Customer database
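
As a rough illustration, the customer table above could be created and read back as follows.
This is only a sketch using Python's built-in sqlite3 module; the column types are assumptions,
since the notes give only the field names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Column types are assumed; the notes list only cno, cname, and ccity.
conn.execute("CREATE TABLE customer (cno INTEGER PRIMARY KEY, cname TEXT, ccity TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Hari", "Hyd"), (2, "Sure", "Ban")],
)

rows = conn.execute("SELECT cno, cname, ccity FROM customer ORDER BY cno").fetchall()
for row in rows:
    print(row)
```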

Database Management System (DBMS): A database management system consists of a collection of
related data together with a set of programs for defining, creating, maintaining, and
manipulating a database.

Popular DBMS Software

• MySQL
• Oracle
• IBM DB2
• SQLite

Advantages of DBMS:

Reduction of redundancies: Centralized control of data by the DBA avoids unnecessary
duplication of data and effectively reduces the total amount of data storage required. Avoiding
duplication also eliminates the inconsistencies that tend to be present in redundant data
files.

Concurrent Access: A DBMS schedules concurrent access to the data in such a manner that each
user can think of the data as being accessed by only one user at a time.

Ex: Railway reservation

Data Security: The DBA, who has the ultimate responsibility for the data in the DBMS, can
ensure that proper access procedures are followed, including proper authentication schemes for
access to the DBMS and additional checks before permitting access to sensitive data.

Backup and Recovery: A DBMS provides backup and recovery subsystems that automatically back up
the data, protect it from hardware and software failures, and restore it when required.

Data Independence: In a database system, the database management system provides the
interface between the application programs and the data. When changes are made to the data
representation, the metadata maintained by the DBMS is changed, but the DBMS continues to
provide the data to application programs in the previously used way. The DBMS handles the task
of transforming the data wherever necessary.

Applications of DBMS:

Library Management System: There are thousands of books in a library, so it is very difficult
to keep a record of all the books in a notebook or register. A DBMS is used to maintain all the
information related to book issue dates, name of the book, author, and availability of the book.
Banking: For customer information, account activities, payments, deposits, loans, etc.
Airlines: For reservations and schedule information.
Universities: To store information about instructors, students, departments, course
offerings, colleges, and grades.
Telecommunication: To keep call records, generate monthly bills, maintain balances, etc.
Manufacturing: For the management of the supply chain and for tracking the production of
items. For example, a distribution centre should keep track of the product units that are
supplied to the centre as well as the products that are delivered out of it each day.
HR Management: For information about employees, salaries, payroll, deductions, generation of
paychecks, etc.
Online Shopping: Online shopping has become a major trend. Many people prefer to shop from
home rather than spend time visiting shops. Products are listed and sold with the help of a
DBMS, and purchase information, invoices, and payments are all handled with the help of a
DBMS.

History of DBMS

1960’s:
From the earliest days of computers, storing and manipulating data have been a major application
focus. The first general-purpose DBMS, called the Integrated Data Store, was designed by Charles
Bachman at General Electric in the early 1960s. It formed the basis for the network data model,
which was standardized by the Conference on Data Systems Languages (CODASYL).

In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even
today in many major installations. IMS formed the basis for an alternative data representation
framework called the hierarchical data model. The SABRE system for making airline
reservations was jointly developed by American Airlines and IBM around the same time, and it
allowed several people to access the same data through a computer network.

In 1970, Edgar Codd, at IBM, proposed a new data representation framework called the
relational model. It sparked the rapid development of several DBMSs based on the relational
model.

In the 1980s, the relational model consolidated its position as the dominant DBMS paradigm, and
database systems continued to gain widespread use. The SQL query language for relational
databases was developed at IBM. SQL was standardized in the late 1980s, and a later revision,
SQL:1999, was adopted by ANSI and ISO.

In the late 1980s and the 1990s, advances were made in many areas of database systems. Several
vendors (e.g., IBM's DB2, Oracle 8) extended their systems with the ability to store new data
types such as images and text. Specialized systems have been developed by numerous vendors for
creating data warehouses, consolidating data from several databases, and carrying out
specialized analysis.

An interesting phenomenon is the emergence of several enterprise resource planning (ERP) and
management resource planning (MRP) packages (Oracle, SAP, PeopleSoft, Siebel), which add a
substantial layer of application-oriented features on top of a DBMS.

The Internet age has perhaps influenced data models even more. Data models were developed
using object-oriented programming features, and queries came to be embedded in web pages built
with scripting languages and Hyper Text Markup Language (HTML). With huge amounts of data
available online, the DBMS is gaining importance as more data is brought online and made ever
more accessible through computer networking.

In the early years of computing, punch cards were used in unit record machines for input, data
storage, and processing. Both data and computer programs were entered offline on cards. This
input method is similar to that of some voting machines today. It was the only method available:
entering and retrieving data was fast, but manipulating or editing it was not.
After that era came file-based data storage, and then DBMSs based on the hierarchical, network,
and relational models.

Components of DBMS
The database management system can be divided into five major components:
1. Hardware
2. Software
3. Data
4. Procedures
5. Database Access Language

Fig: Components of DBMS

Hardware:
When we say hardware, we mean the computer, hard disks, I/O channels for data, and any other
physical component involved before data is successfully stored in memory.
When we run Oracle or MySQL on a personal computer, the computer's hard disk, the keyboard on
which we type the commands, and the computer's RAM and ROM all become part of the DBMS
hardware.
Software:
This is the main component, as it is the program that controls everything. The DBMS
software is like a wrapper around the physical database that provides an easy-to-use
interface to store, access, and update data.
The DBMS software is capable of understanding the Database Access Language and interpreting it
into actual database commands that it executes on the database.

Data:
Data is the resource for which the DBMS was designed. The motive behind the creation of the
DBMS was to store and utilize data.
In a typical database, both the data saved by the user and metadata are stored.
Metadata is data about the data. It is information stored by the DBMS to better understand
the data stored in it.
For example, when I store my name in a database, the DBMS may record when the name was
stored, what its size is, and whether it is stored as related to some other data or is
independent. All this information is metadata.
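
To make the difference between user data and metadata concrete, here is a small sketch using
Python's built-in sqlite3 module. The person table is a made-up example. Note that SQLite's
catalog (the sqlite_master table and PRAGMA table_info) records structural metadata such as
table and column names and types; it does not record when a value was stored, so this covers
only part of what the paragraph above describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (name) VALUES ('Ravi')")

# User data: the rows the user saved.
user_rows = conn.execute("SELECT id, name FROM person").fetchall()

# Metadata: SQLite's catalog table lists the objects defined in the database.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'person'"
).fetchall()

# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [(c[1], c[2]) for c in conn.execute("PRAGMA table_info(person)")]

print(user_rows)
print(catalog)
print(columns)
```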
Procedures:
Procedures refer to general instructions for using a database management system. They include
procedures to set up and install a DBMS, to log in to and log out of the DBMS software, to
manage databases, to take backups, to generate reports, etc.
Database Access Language:
A database access language is a simple language designed for writing commands to access, insert,
update, and delete data stored in a database.
A user can write commands in the database access language and submit them to the DBMS, which
translates and executes them. Using the access language, a user can create new databases and
tables, insert data, fetch stored data, update data, and delete data.
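
SQL is the most common database access language. The sketch below, written with Python's
built-in sqlite3 module, submits one command of each kind mentioned above. The book table and
its contents are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create a table (data definition).
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, available INTEGER)")

# Insert data.
conn.execute("INSERT INTO book (title, available) VALUES (?, ?)", ("DBMS Concepts", 1))
conn.execute("INSERT INTO book (title, available) VALUES (?, ?)", ("Operating Systems", 1))

# Update data: mark one book as issued.
conn.execute("UPDATE book SET available = 0 WHERE title = ?", ("DBMS Concepts",))

# Delete data.
conn.execute("DELETE FROM book WHERE title = ?", ("Operating Systems",))

# Fetch stored data.
remaining = conn.execute("SELECT title, available FROM book").fetchall()
print(remaining)
```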
Users

Database Administrators: The Database Administrator (DBA) is the one who manages the
complete database management system. The DBA takes care of the security of the DBMS, its
availability, license keys, user accounts and access, etc.

Application Programmer or Software Developer: This user group is involved in designing and
developing the application programs that work with the DBMS.

End User: These days all modern applications, web or mobile, store user data. Applications
are programmed in such a way that they collect user data and store it on DBMS systems running
on their servers. End users are the ones who store, retrieve, update, and delete data.
Data Model:
A data model is a way of modeling the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a database
at each level of data abstraction. Data models are used to describe the structure of the database.

Hierarchical Model

In a hierarchical database model, data is organized in a tree-like structure. The hierarchy
starts from the root data and expands like a tree, adding child nodes to the parent nodes. Data
is represented using a parent-child relationship. In a hierarchical DBMS a parent may have many
children, but each child has only one parent.

This model efficiently describes many real-world relationships, such as the index of a book or
recipes. The tree is built from one-to-many relationships between two different types of data.
For example, one department can have many courses, many professors, and of course many
students.
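
The one-parent rule can be sketched in Python as a tree of nodes. This is only an illustration of
the structure, not of how any hierarchical DBMS stores data; the Node class and the sample
department tree are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """A record in a hierarchy: many children, at most one parent."""
    name: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, name: str) -> "Node":
        child = Node(name, parent=self)  # the child records its single parent
        self.children.append(child)
        return child

    def path(self) -> str:
        """Walk up the parent links to the root; there is exactly one such path."""
        node, parts = self, []
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))


department = Node("Department")
courses = department.add_child("Courses")
professors = department.add_child("Professors")
dbms = courses.add_child("DBMS")

print(dbms.path())
```

Because every node has a single parent, each record can be reached by exactly one path from the
root.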

Network Model
The network database model allows each child to have multiple parents. This makes it possible
to model more complex relationships, such as the many-to-many relationship between orders and
parts. In this model, entities are organized in a graph, which can be accessed through
several paths.
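
The orders/parts relationship can be sketched as a graph in which a part may belong to several
orders. The dictionaries and the sample order and part names below are assumptions for
illustration only.

```python
from collections import defaultdict

# Two adjacency maps give access paths in both directions.
parts_of_order = defaultdict(set)
orders_of_part = defaultdict(set)


def link(order: str, part: str) -> None:
    """Record that an order contains a part (a many-to-many link)."""
    parts_of_order[order].add(part)
    orders_of_part[part].add(order)


link("O1", "bolt")
link("O1", "nut")
link("O2", "bolt")  # "bolt" now has two parents: O1 and O2

print(sorted(orders_of_part["bolt"]))
print(sorted(parts_of_order["O1"]))
```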
Relational model
The relational model is the most widely used DBMS model because it is one of the simplest. It
organizes normalized data in the rows and columns of tables. Data in the relational model is
stored in fixed structures and manipulated using SQL.
Ex: An instance of a student relation.

Sid    Name     Age   Gpa
1201   Suresh   20    3.7
1202   Ramesh   21    3.6
1203   Maesh    19    3.6
1204   Naresh   20    3.8
1205   Haresh   21    3.7

The above relation contains four attributes (sid, name, age, gpa) and five tuples (also called
records or rows).
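
A short sketch of how this relation might be queried with SQL, using Python's built-in sqlite3
module. The column types and the particular query are assumptions; the rows are the ones listed
above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, age INTEGER, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1201, "Suresh", 20, 3.7),
        (1202, "Ramesh", 21, 3.6),
        (1203, "Maesh", 19, 3.6),
        (1204, "Naresh", 20, 3.8),
        (1205, "Haresh", 21, 3.7),
    ],
)

# Number of tuples in the relation.
tuple_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

# Names of students whose gpa is above 3.6.
top_students = [
    name for (name,) in conn.execute("SELECT name FROM student WHERE gpa > 3.6 ORDER BY sid")
]
print(tuple_count, top_students)
```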

Object-Oriented Model
In the object-oriented model, data is stored in the form of objects. The structures, called
classes, hold the data within them. The model defines a database as a collection of objects
that store both the values of data members and the operations on them.

Entity Relationship Model: The entity-relationship (E-R) data model uses a collection of basic
objects, called entities, and relationships among these objects. An entity is a “thing” or “object”
in the real world that is distinguishable from other objects. A relationship is an association
among several entities. For example, a depositor relationship associates a customer with each
account that she has.

Figure: A sample E-R diagram


View of Data:

A database system is a collection of interrelated data and a set of programs that allow users to
access and modify these data. A major purpose of a database system is to provide users with an
abstract view of the data. That is, the system hides certain details of how the data are stored and
maintained.

Data Abstraction

Database systems are made up of complex data structures. To ease user interaction with the
database, the developers hide irrelevant internal details from users. This process of hiding
irrelevant details from the user is called data abstraction.

A data definition language (DDL) is used to define the external and conceptual schemas.

The data in a DBMS is described at three levels of abstraction.

Physical level: This is the lowest level of data abstraction. The physical schema summarizes
how the relations described in the conceptual schema are actually stored on secondary storage
devices such as disks and tapes.
It specifies the access methods (such as sequential or random access) and the file organization
methods (such as B+ trees and hashing) used for this purpose.
Suppose we need to store the details of an employee. The blocks of storage and the amount of
memory used for this purpose are kept hidden from the user.

Conceptual level: The next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The conceptual schema describes all
relations that are stored in the database. In our sample university database, these relations contain
information about entities, such as students and faculty, and about relationships, such as
students’ enrollment in courses.

The conceptual schema for the university describes:

Students(sid: string, name: string, login: string, age: integer, gpa: real)
Faculty(fid: string, fname: string, sal: real)
Courses(cid: string, cname: string, credits: integer)
Rooms(rno: integer, address: string, capacity: integer)
Enrolled(sid: string, cid: string, grade: string)
Teaches(fid: string, cid: string)
Meets_In(cid: string, rno: integer, time: string)

Fig: Levels of data abstraction

View level:
External schema allows data access to be customized (and authorized) at the level of individual
users or groups of users. Any given database has exactly one conceptual schema and one
physical schema because it has just one set of stored relations, but it may have several external
schemas. Each external schema consists of a collection of one or more views and relations from
the conceptual schema. A view is conceptually a relation, but the records in a view are not stored
in the DBMS, they are computed using a definition for the view, in terms of relations stored in
the DBMS.
For example, we might want to allow students to find out the names of faculty members teaching
courses, as well as course enrollments.
This can be done by defining the following view:
Courseinfo(cid: string, fname: string, enrollment: integer)
A user can treat a view just like a relation and ask questions about the records in the view. Even
though the records in the view are not stored explicitly, they are computed as needed. We did not
include Courseinfo in the conceptual schema because we can compute Courseinfo from the
relations in the conceptual schema, and to store it in addition would be redundant. Such
redundancy, in addition to the wasted space, could lead to inconsistencies.
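
One possible definition of the Courseinfo view is sketched below with Python's built-in sqlite3
module. The notes do not give the view definition, so the join and the sample rows are
assumptions. The point of the sketch is that the view's records are computed from the stored
relations each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Made-up sample rows.
INSERT INTO Faculty  VALUES ('F1', 'Rao', 50000.0);
INSERT INTO Teaches  VALUES ('F1', 'C1');
INSERT INTO Enrolled VALUES ('S1', 'C1', 'A'), ('S2', 'C1', 'B');

-- Assumed definition: one row per taught course with its enrollment count.
CREATE VIEW Courseinfo AS
    SELECT T.cid AS cid, F.fname AS fname, COUNT(E.sid) AS enrollment
    FROM Teaches T
    JOIN Faculty F ON F.fid = T.fid
    LEFT JOIN Enrolled E ON E.cid = T.cid
    GROUP BY T.cid, F.fname;
""")

before = conn.execute("SELECT cid, fname, enrollment FROM Courseinfo").fetchall()

# The view stores nothing itself, so a new enrollment shows up immediately.
conn.execute("INSERT INTO Enrolled VALUES ('S3', 'C1', 'A')")
after = conn.execute("SELECT cid, fname, enrollment FROM Courseinfo").fetchall()
print(before, after)
```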
Data Independence:
A very important advantage of using a DBMS is that it offers data independence. The ability to
modify the schema definition at one level without affecting the schema definition at the next
higher level is called data independence. Data independence is achieved through the use of the
three levels of data abstraction.
There are two kinds of data independence: physical data independence and logical data
independence.
Physical data independence is the ability to modify the physical schema without causing
application programs to be rewritten. Modifications at the physical level are occasionally
necessary to improve performance. It means we change the physical storage/level without
affecting the conceptual or external view of the data.
Logical data independence is the ability to modify the logical schema without causing
application programs to be rewritten. Modifications at the logical level are necessary whenever
the logical structure of the database is altered.
For example, suppose that the Faculty relation in our university database is replaced by the
following two relations:
Faculty_public(fid: string, fname: string, office: integer)
Faculty_private(fid: string, sal: real)
The Courseinfo view relation can be redefined in terms of Faculty_public and Faculty_private,
which together contain all the information in Faculty, so that a user who queries Courseinfo will
get the same answers as before.
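
The Faculty split can be tried out as a sketch with Python's built-in sqlite3 module. The view
definition and the sample rows are assumptions; what matters is that the same query on
Courseinfo returns the same answer before and after the logical schema changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
INSERT INTO Faculty  VALUES ('F1', 'Rao', 50000.0);
INSERT INTO Teaches  VALUES ('F1', 'C1');
INSERT INTO Enrolled VALUES ('S1', 'C1', 'A'), ('S2', 'C1', 'B');

CREATE VIEW Courseinfo AS
    SELECT T.cid AS cid, F.fname AS fname, COUNT(E.sid) AS enrollment
    FROM Teaches T JOIN Faculty F ON F.fid = T.fid
    LEFT JOIN Enrolled E ON E.cid = T.cid
    GROUP BY T.cid, F.fname;
""")

# The "application program": it only knows about the view.
APP_QUERY = "SELECT cid, fname, enrollment FROM Courseinfo ORDER BY cid"
answer_before = conn.execute(APP_QUERY).fetchall()

# Change the logical schema: split Faculty, then redefine the view over the new relations.
conn.executescript("""
DROP VIEW Courseinfo;
CREATE TABLE Faculty_public  (fid TEXT PRIMARY KEY, fname TEXT, office INTEGER);
CREATE TABLE Faculty_private (fid TEXT PRIMARY KEY, sal REAL);
INSERT INTO Faculty_public  SELECT fid, fname, NULL FROM Faculty;
INSERT INTO Faculty_private SELECT fid, sal FROM Faculty;
DROP TABLE Faculty;

CREATE VIEW Courseinfo AS
    SELECT T.cid AS cid, P.fname AS fname, COUNT(E.sid) AS enrollment
    FROM Teaches T JOIN Faculty_public P ON P.fid = T.fid
    LEFT JOIN Enrolled E ON E.cid = T.cid
    GROUP BY T.cid, P.fname;
""")

answer_after = conn.execute(APP_QUERY).fetchall()  # unchanged application query
print(answer_before == answer_after)
```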

Fig: Data Independence


Instance and schema:
DBMS Schema
Definition of schema: The design of a database is called the schema. A schema is of three
types: physical schema, logical schema, and view schema.
For example, in the following diagram we have a schema that shows the relationship between
three tables: Course, Student, and Section. The diagram only shows the design of the database;
it does not show the data present in those tables. A schema is only a structural view (design)
of a database, as shown in the diagram below.

The design of a database at the physical level is called the physical schema. How the data is
stored in blocks of storage is described at this level.
The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described as certain types of data
records stored in data structures; however, internal details such as the implementation of the
data structures are hidden at this level (they are available at the physical level).
The design of a database at the view level is called the view schema. It generally describes
end-user interaction with database systems.
DBMS Instance
Definition of instance: The data stored in a database at a particular moment in time is called
an instance of the database. The database schema defines the variable declarations in the tables
that belong to a particular database; the values of these variables at a moment in time are
called the instance of that database.

For example, let's say we have a single table student in the database. Today the table has 100
records, so today the instance of the database has 100 records. Let's say we add another 100
records to this table by tomorrow, so the instance of the database tomorrow will have
200 records in the table. In short, the data stored in the database at a particular moment is
called the instance, and it changes over time as we add or delete data.
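
The 100-then-200 records example can be sketched with Python's built-in sqlite3 module. The
student table layout is an assumption. The schema is read from the catalog and stays the same,
while the instance (the rows) grows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT)")


def schema(conn):
    """Column names and types of the student table (the design)."""
    return [(c[1], c[2]) for c in conn.execute("PRAGMA table_info(student)")]


def instance_size(conn):
    """Number of records currently stored (a property of the instance)."""
    return conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]


# "Today": 100 records.
conn.executemany("INSERT INTO student (name) VALUES (?)", [(f"s{i}",) for i in range(100)])
schema_today, size_today = schema(conn), instance_size(conn)

# "Tomorrow": another 100 records are added.
conn.executemany("INSERT INTO student (name) VALUES (?)", [(f"s{i}",) for i in range(100, 200)])
schema_tomorrow, size_tomorrow = schema(conn), instance_size(conn)

print(size_today, size_tomorrow, schema_today == schema_tomorrow)
```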
Database Architecture
The architecture of a database system is greatly influenced by the underlying computer system
on which the database is running. Database systems can be centralized, client-server,
parallel (multi-processor), or distributed. Database application architecture is commonly
divided into two types: two-tier architecture and three-tier architecture.
Two-tier architecture: In Two-tier architecture the application resides at the client machine,
where it invokes database system functionality at the server machine through query language
statements. Application program interface standards like ODBC and JDBC are used for
interaction between the client and the server.
Example: A transaction performed by a bank employee after the customer fills in a deposit form.
Another example is a railway reservation made by an employee at the reservation counter.
Three-tier architecture: In three-tier architecture, the client machine acts as merely a front end
and does not contain any direct database calls. Instead, the client end communicates with an
application server, usually through a forms interface. The application server in turn
communicates with a database system to access data. The business logic of the application,
which says what actions to carry out under what conditions, is embedded in the application
server, instead of being distributed across multiple clients. Three-tier applications are more
appropriate for large applications, and for applications that run on the World Wide Web.
Example: An amount transferred by the user through an application program (net banking) without
physically going to the bank. Another example is a railway reservation made by the user from his
own location without going to the reservation counter.
Three-tier architecture provides scalability when a larger number of clients or users interact
with the database.
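
The separation of responsibilities in a three-tier design can be sketched in a single Python
process. In a real deployment the three tiers run on separate machines and talk over a network;
the class names, the withdraw rule, and the sample account below are all assumptions made for
illustration.

```python
import sqlite3


class DatabaseTier:
    """Tier 3: the only layer that issues SQL."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance INTEGER)")
        self.conn.execute("INSERT INTO account VALUES (1, 1000)")
        self.conn.commit()

    def get_balance(self, acc_no):
        row = self.conn.execute(
            "SELECT balance FROM account WHERE acc_no = ?", (acc_no,)
        ).fetchone()
        return None if row is None else row[0]

    def set_balance(self, acc_no, balance):
        with self.conn:  # commit the update as one transaction
            self.conn.execute(
                "UPDATE account SET balance = ? WHERE acc_no = ?", (balance, acc_no)
            )


class ApplicationServer:
    """Tier 2: holds the business logic (what to do under which conditions)."""

    def __init__(self, db):
        self.db = db

    def withdraw(self, acc_no, amount):
        balance = self.db.get_balance(acc_no)
        if balance is None:
            return "unknown account"
        if amount <= 0 or amount > balance:
            return "rejected"
        self.db.set_balance(acc_no, balance - amount)
        return "ok"


class Client:
    """Tier 1: a front end that makes no direct database calls."""

    def __init__(self, server):
        self.server = server

    def submit_withdraw_form(self, acc_no, amount):
        return self.server.withdraw(acc_no, amount)


db = DatabaseTier()
client = Client(ApplicationServer(db))
first = client.submit_withdraw_form(1, 300)
second = client.submit_withdraw_form(1, 5000)
print(first, second, db.get_balance(1))
```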

Fig: Two-tier and three tier architectures


STRUCTURE OF A DBMS
A Database Management System (DBMS) is software that allows access to data stored in a
database and provides an easy and effective method of defining, storing, and manipulating
information, protecting it from system crashes or data theft, and differentiating access
permissions for different users.
The database system is divided into three components: Query Processor, Storage Manager,
and Disk Storage.

Fig: Structure of a DBMS


Database Users and User Interfaces
There are four different types of database-system users, differentiated by the way they expect to
interact with the system. Different types of user interfaces have been designed for the different
types of users.

Naive users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written by the application programmer previously.

For example, consider a user who wishes to find her account balance over the World Wide Web. Such a
user may access a form, where she enters her account number. An application program at the
Web server then retrieves the account balance, using the given account number, and passes this
information back to the user.

The typical user interface for naive users is a forms interface, where the user can fill in
appropriate fields of the form. Naive users may also simply read reports generated from the
database.
Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces. Rapid
application development (RAD) tools are tools that enable an application programmer to
construct forms and reports without writing a program.
There are also special types of programming languages that combine imperative control
structures (for example, for loops, while loops and if-then-else statements) with statements of the
data manipulation language. These languages, sometimes called fourth-generation languages,
often include special features to facilitate the generation of forms and the display of data on the
screen. Most major commercial database systems include a fourth generation language.

Sophisticated users interact with the system without writing programs. Instead, they form their
requests in a database query language. They submit each such query to a query processor, whose
function is to break down DML statements into instructions that the storage manager
understands. Analysts who submit queries to explore data in the database fall in this category.

Online analytical processing (OLAP) tools simplify analysts’ tasks by letting them view
summaries of data in different ways. For instance, an analyst can see total sales by region (for
example, North, South, East, and West), or by product, or by a combination of region and
product (that is, total sales of each product in each region). The tools also permit the analyst to
select specific regions, look at data in more detail (for example, sales by city within a region) or
look at the data in less detail (for example, aggregate products together by category). Another
class of tools for analysts is data mining tools, which help them find certain kinds of patterns in
data.
Specialized users are sophisticated users who write specialized database applications that do
not fit into the traditional data processing framework. Among these applications are
computer-aided design systems, knowledge-base and expert systems, systems that store data with
complex data types (for example, graphics data and audio data), and environment-modeling systems.
Database Administrator
One of the main reasons for using DBMSs is to have central control of both the data and the
programs that access those data. A person who has such central control over the system is called
a database administrator (DBA).

The functions of a DBA include:

Schema definition: The DBA creates the original database schema by executing a set of data
definition statements in the DDL.

Schema and physical-organization modification: The DBA carries out changes to the schema
and physical organization to reflect the changing needs of the organization, or to alter the
physical organization to improve performance.

Granting of authorization for data access: By granting different types of authorization, the
database administrator can regulate which parts of the data base various users can access. The
authorization information is kept in a special system structure that the database system consults
whenever someone attempts to access the data in the system.

Routine maintenance:

• Periodically backing up the database, either onto tapes or onto remote servers, to prevent
  loss of data in case of disasters such as flooding.
• Ensuring that enough free disk space is available for normal operations, and upgrading
  disk space as required.
• Monitoring jobs running on the database and ensuring that performance is not degraded
  by very expensive tasks submitted by some users.
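
The authorization structure that the system consults on every access can be pictured as a lookup
table. The sketch below is a deliberately simplified assumption: real database systems keep this
information in system catalogs and manage it with statements such as GRANT, and the user names,
relation names, and privileges here are made up.

```python
# Hypothetical authorization table: (user, relation) -> set of allowed actions.
AUTHORIZATIONS = {
    ("teller", "account"): {"read", "update"},
    ("auditor", "account"): {"read"},
}


def grant(user: str, relation: str, action: str) -> None:
    """What a DBA does when granting a privilege."""
    AUTHORIZATIONS.setdefault((user, relation), set()).add(action)


def is_authorized(user: str, relation: str, action: str) -> bool:
    """What the system checks whenever someone attempts to access data."""
    return action in AUTHORIZATIONS.get((user, relation), set())


can_auditor_update = is_authorized("auditor", "account", "update")
grant("clerk", "customer", "read")
can_clerk_read = is_authorized("clerk", "customer", "read")
print(can_auditor_update, can_clerk_read)
```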

Storage Manager

A storage manager is a program module that provides the interface between the low level data
stored in the database and the application programs and queries submitted to the system. The
storage manager is responsible for the interaction with the file manager. The raw data are stored
on the disk using the file system, which is usually provided by a conventional operating system.
The storage manager translates the various DML statements into low-level file-system
commands. Thus, the storage manager is responsible for storing, retrieving, and updating data in
the database.

The storage manager components include:

Authorization and integrity manager: Tests for the satisfaction of integrity constraints and
checks the authority of users to access data.

Transaction manager: Ensures that the database remains in a consistent (correct) state despite
system failures, and that concurrent transaction executions proceed without conflicting.

File manager: Manages the allocation of space on disk storage and the data structures used to
represent information stored on disk.

Buffer manager: It is responsible for fetching data from disk storage into main memory, and
deciding what data to cache in main memory. The buffer manager is a critical part of the
database system, since it enables the database to handle data sizes that are much larger than the
size of main memory.
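
The buffer manager's job of deciding what to keep in main memory can be sketched as a small
cache in Python. The notes do not name a replacement policy, so least-recently-used (LRU)
eviction is an assumption here, and a dictionary stands in for the disk.

```python
from collections import OrderedDict


class BufferManager:
    """Keeps at most `capacity` disk blocks in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk              # dict block_id -> data; stands in for disk storage
        self.capacity = capacity
        self.pool = OrderedDict()     # blocks currently cached, oldest use first
        self.disk_reads = 0

    def fetch(self, block_id):
        if block_id in self.pool:
            self.pool.move_to_end(block_id)   # cache hit: mark as most recently used
            return self.pool[block_id]
        self.disk_reads += 1                  # cache miss: read from "disk"
        data = self.disk[block_id]
        if len(self.pool) >= self.capacity:
            self.pool.popitem(last=False)     # evict the least recently used block
        self.pool[block_id] = data
        return data


disk = {i: f"block-{i}" for i in range(5)}    # five blocks on disk, room for two in memory
buf = BufferManager(disk, capacity=2)
for block_id in (0, 1, 0, 2):
    buf.fetch(block_id)

print(list(buf.pool), buf.disk_reads)
```

Four requests cost three disk reads, because the second request for block 0 is served from
memory; block 1 is evicted when block 2 arrives.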

The storage manager implements the following data structures:

Data files: Store the database itself.

Data dictionary: Stores metadata about the structure of the database.

Indices: Provide fast access to data items.

The Query Processor:

The query processor components include:

DDL interpreter, which interprets DDL statements and records the definitions in the data
dictionary.

DML compiler, which translates DML statements in a query language into an evaluation plan
consisting of low-level instructions that the query evaluation engine understands. A query can
usually be translated into any of a number of alternative evaluation plans that all give the same
result. The DML compiler also performs query optimization, that is, it picks the lowest cost
evaluation plan from among the alternatives.

Query evaluation engine, which executes low-level instructions generated by the DML compiler.
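
Query optimization and indices can be observed together in SQLite, which exposes the chosen plan
through EXPLAIN QUERY PLAN. The sketch below uses Python's built-in sqlite3 module; the student
table, the index name idx_student_name, and the query are assumptions. The exact wording of the
plan text varies between SQLite versions, so the sketch only checks whether the index is
mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student (name, gpa) VALUES (?, ?)",
    [(f"s{i}", 3.0) for i in range(1000)],
)


def plan(conn, sql, params=()):
    """Return the optimizer's plan; the last column of each row is a readable description."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


QUERY = "SELECT sid FROM student WHERE name = ?"

plan_without_index = plan(conn, QUERY, ("s42",))    # no index on name: whole table is read

conn.execute("CREATE INDEX idx_student_name ON student(name)")
plan_with_index = plan(conn, QUERY, ("s42",))       # optimizer now picks the index

print(plan_without_index)
print(plan_with_index)
```

Both plans give the same result for the query; the optimizer simply switches to the cheaper one
once the index exists.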
Introduction to Database Design

Database Design: The database design process can be divided into six steps.

Requirements Analysis: The very first step in designing a database application is to understand
what data is to be stored in the database, what applications must be built on top of it, and what
operations are most frequent and subject to performance requirements. In other words, we must
find out what the users want from the database.

This is usually an informal process that involves discussions with user groups, a study of the
current operating environment and how it is expected to change, analysis of any available
documentation on existing applications that are expected to be replaced or complemented by the
database, and so on. Several methodologies have been proposed for organizing and presenting
the information gathered in this step, and some automated tools have been developed to support
this process.

Conceptual Database Design: The information gathered in the requirements analysis step is
used to develop a high-level description of the data to be stored in the database, along with the
constraints that are known to hold over this data. This step is often carried out using the ER
model, or a similar high-level data model, and is discussed in the rest of this chapter.

Logical Database Design: We must choose a DBMS to implement our database design, and
convert the conceptual database design into a database schema in the data model of the chosen
DBMS. We will only consider relational DBMSs, and therefore, the task in the logical design
step is to convert an ER schema into a relational database schema.

Schema Refinement: The fourth step in database design is to analyze the collection of relations
in our relational database schema to identify potential problems, and to refine it. In contrast to
the requirements analysis and conceptual design steps, which are essentially subjective, schema
refinement can be guided by some elegant and powerful theory.

Physical Database Design: In this step we must consider typical expected workloads that our
database must support and further refine the database design to ensure that it meets desired
performance criteria. This step may simply involve building indexes on some tables and
clustering some tables, or it may involve a substantial redesign of parts of the database schema
obtained from the earlier design steps.

Security Design: In this step, we identify different user groups and different roles played by
various users (e.g., the development team for a product, the customer support representatives, the
product manager). For each role and user group, we must identify the parts of the database that
they must be able to access and the parts of the database that they should not be allowed to
access, and take steps to ensure that they can access only the necessary parts.
Entity-Relationship Model:

An Entity–relationship model (ER model) describes the structure of a database with the help of
a diagram, which is known as Entity Relationship Diagram (ER Diagram). An ER model is a
design or blueprint of a database that can later be implemented as a database.

The entity-relationship (E-R) data model perceives the real world as consisting of basic objects,
called entities, and relationships among these objects. It develops a conceptual design for the
database.

The E-R data model employs three basic notions: entity sets, relationship sets, and attributes.
For example, suppose we design a school database. In this database, a student will be an
entity with attributes such as id, name, address, and age.

Fig: ER Model

Entity Sets:
An entity is a “thing” or “object” in the real world that is distinguishable from all other objects.
An entity has a set of properties, and the values for some set of properties may uniquely identify
an entity.
An entity set is a set of entities of the same type that share the same properties, or attributes. An
entity set is represented by a rectangle, and an attribute is represented by an oval.
For example, the Employees entity set could use name, social security number (ssn), and parking
lot (lot) as attributes.

Figure: Employees Entity Set (entity set Employees with attributes ssn, name, and lot)
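
One way to picture the Employees entity set in code is as a set of records that share the same
attributes. This is only a sketch: the ssn and lot values below are made up, and find_by_ssn is a
hypothetical helper, not part of any DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    ssn: str   # attribute whose value identifies the entity
    name: str
    lot: int


# An entity set: entities of the same type with the same attributes.
employees = {
    Employee("123-22-3666", "Attishoo", 48),
    Employee("231-31-5368", "Smiley", 22),
}


def find_by_ssn(entity_set, ssn):
    """Return the entity with the given ssn, or None if there is none."""
    matches = [e for e in entity_set if e.ssn == ssn]
    return matches[0] if matches else None
```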


Weak Entity Type:

A weak entity is an entity that does not have a key attribute of its own; it is uniquely identified
only with the help of the key attribute of another entity set. It is represented by a double
rectangle.

For example, in an ATM Transactions diagram, the weak entity Transactions is identified using the
ATM ID.

Fig: Weak entity

Simple and composite attributes:

A simple attribute is not divided into subparts. A composite attribute, on the other hand, can be
divided into subparts (that is, other attributes).

For example, an attribute address could be structured as a composite attribute consisting of
str_address, city, and pincode.

Fig: Simple and composite attributes


Single-valued and multi valued attributes:

The attributes in our examples all have a single value for a particular entity. For instance, the
CustID attribute of a specific customer entity holds exactly one value. Such attributes are
said to be single valued.

There may be instances where an attribute has a set of values for a specific entity. Consider a
customer entity set with the attribute Custphone. A customer may have more than one phone
number. This type of attribute is said to be multi valued. A multi valued attribute is drawn
in an ER diagram as a double ellipse.

Fig: Multi Valued Attribute

Derived attribute: The value of this type of attribute can be derived from the values of other
related attributes or entities. A derived attribute is drawn in an ER diagram as a dotted ellipse.

Suppose that the student entity set has an attribute age, which indicates the student's age. If the
student entity set also has an attribute date-of-birth, we can calculate age from date-of-birth, so
age is a derived attribute.

Fig: Derived attribute
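
The attribute kinds described above (composite, multi valued, derived) can be sketched with
ordinary Python classes. The class and field names follow the examples in the text; the sample
values and the age method are illustrative assumptions.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Address:
    """Composite attribute: made up of smaller attributes."""
    str_address: str
    city: str
    pincode: str


@dataclass
class Customer:
    cust_id: int            # single-valued attribute
    address: Address        # composite attribute
    date_of_birth: date     # stored attribute
    phones: set = field(default_factory=set)  # multi valued attribute

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical sample customer.
c = Customer(1, Address("12 MG Road", "Hyderabad", "500001"), date(2000, 6, 15))
c.phones.update({"040-1111", "040-2222"})
```

Because age is computed on demand, it can never disagree with the stored date of birth.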


Key Attribute
A key attribute is an attribute whose value uniquely identifies each entity in an entity set. It
corresponds to a primary key. The key attribute is represented by an ellipse with the text underlined.
Example: in the student entity set, id is the key attribute.

Relationship Sets:

A relationship is an association among several entities. A relationship set is a set of relationships
of the same type. A diamond (rhombus) is used to represent a relationship set.

Consider the relationship set Works In, in which each relationship indicates a department in which
an employee works. Note that several relationship sets might involve the same entity sets. For
example, we could also have a Manages relationship set involving Employees and Departments.

Figure: Works in Relationship Set

A relationship can also have descriptive attributes. Descriptive attributes are used to record
information about the relationship, rather than about any one of the participating entities; for
example, we may wish to record that Attishoo works in the pharmacy department as of January
1991. This information is captured by adding an attribute, since, to Works In.

A relationship must be uniquely identified by the participating entities, without reference to
the descriptive attributes. In the Works In relationship set, for example, each Works In
relationship must be uniquely identified by the combination of employee ssn and department
did. Thus, for a given employee-department pair, we cannot have more than one associated
since value.
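
The uniqueness rule above can be sketched by keying each Works In relationship on the pair
(ssn, did) and storing since as a value attached to that pair. The WorksIn class and the sample
values are hypothetical.

```python
class WorksIn:
    """Relationship set keyed by the participating entities (ssn, did)."""

    def __init__(self):
        self._since = {}  # (ssn, did) -> descriptive attribute since

    def add(self, ssn, did, since):
        key = (ssn, did)
        if key in self._since:
            # A second since value for the same pair is not allowed.
            raise ValueError(f"relationship {key} already exists")
        self._since[key] = since

    def since(self, ssn, did):
        return self._since[(ssn, did)]


works_in = WorksIn()
works_in.add("123-22-3666", 51, "1991-01")
works_in.add("123-22-3666", 56, "1993-03")  # same employee, another department

try:
    works_in.add("123-22-3666", 51, "1995-07")  # same pair again
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```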
Degree of a relationship set:
The number of different entity sets participating in a relationship set is called the degree of the
relationship set.
Unary Relationship:
When only one entity set participates in a relationship, the relationship is called a unary
relationship. For example, one person is married to another person; both belong to the same entity set.

Binary Relationship:
When two entity sets participate in a relationship, the relationship is called a binary
relationship. For example, Student is enrolled in Course.

Ternary Relationship:

When three entity sets participate in a relationship, the relationship is called a ternary
relationship. In this example, three entity sets participate in the relationship works_in3.

Fig: A Ternary Relationship Set


Cardinality of a relationship:

The number of times an entity of an entity set participates in a relationship set is known as
cardinality. Cardinality can be of different types:

1. One-to-One Relationship

When a single instance of an entity is associated with a single instance of another entity, it is
called a one-to-one relationship. For example, assume that a male can marry one female
and a female can marry one male. The relationship is then one-to-one.

Fig: One to One Relationship

Using Sets, it can be represented as:

2. One-to-Many Relationship

When a single instance of an entity is associated with more than one instance of another entity,
it is called a one-to-many relationship. For example, a customer can place many orders, but an
order cannot be placed by many customers.

Using Sets, it can be represented as:


3. Many-to-One Relationship
When more than one instance of an entity is associated with a single instance of another entity,
it is called a many-to-one relationship. For example, many students can study in a single
college, but a student cannot study in many colleges at the same time.

Using Sets, it can be represented as:

4. Many-to-Many Relationship

When more than one instance of an entity is associated with more than one instance of another
entity, it is called a many-to-many relationship. For example, a student can enroll in multiple
courses, and a course can have many students enrolled in it.

Using Sets, it can be represented as:
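
The four cases above can be told apart by looking at a relationship as a set of (a, b) pairs. The
sketch below classifies one particular instance of a relationship; it does not prove a constraint
on the schema, and the function name and sample data are assumptions.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs."""
    pairs = set(pairs)
    left = Counter(a for a, _ in pairs)    # number of b's linked to each a
    right = Counter(b for _, b in pairs)   # number of a's linked to each b
    a_has_many = any(n > 1 for n in left.values())
    b_has_many = any(n > 1 for n in right.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


# Hypothetical instances matching the examples in the text.
places = {("cust1", "order1"), ("cust1", "order2")}
studies_in = {("stud1", "college1"), ("stud2", "college1")}
enrolled = {("stud1", "c1"), ("stud1", "c2"), ("stud2", "c1")}
married = {("m1", "f1"), ("m2", "f2")}
```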


Additional features of ER-Model:
Some constructs in the ER model allow us to describe subtle properties of the data.

1. Key constraints:
A key is a set of attributes whose values uniquely identify an entity in an entity set.
Keys also help to identify relationships uniquely, and thus distinguish relationships from
each other.
For example, consider the relationship set Manages between the Employees and
Departments entity sets, such that each department has at most one manager, although a
single employee is allowed to manage more than one department. The restriction that each
department has at most one manager is a key constraint. This constraint is represented in
the ER diagram by an arrow from Departments to Manages. Here ssn is the key attribute of
the Employees entity set, and did is the key attribute of the Departments entity set.

Fig: Key constraints on Manages (Employees with attributes ssn, name, lot; Manages with attribute
since; Departments with attributes did, dname, budget)
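
A minimal sketch of how this key constraint could be enforced once the design is mapped to tables,
using Python's built-in sqlite3 module. The table layout is one possible mapping, and the sample
rows are made up. Making did the primary key of Manages allows at most one manager per department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, budget REAL);
-- did alone is the key of Manages: at most one manager per department
CREATE TABLE Manages (
    did   INTEGER PRIMARY KEY REFERENCES Departments(did),
    ssn   TEXT NOT NULL REFERENCES Employees(ssn),
    since TEXT
);
INSERT INTO Employees VALUES ('111', 'A', 1), ('222', 'B', 2);
INSERT INTO Departments VALUES (51, 'Pharmacy', 1000.0), (56, 'Toys', 500.0);
-- one employee may manage several departments
INSERT INTO Manages VALUES (51, '111', '1991-01');
INSERT INTO Manages VALUES (56, '111', '1992-06');
""")

try:
    # A second manager for department 51 violates the key constraint.
    conn.execute("INSERT INTO Manages VALUES (51, '222', '1993-01')")
    second_manager_rejected = False
except sqlite3.IntegrityError:
    second_manager_rejected = True

manages_rows = conn.execute("SELECT COUNT(*) FROM Manages").fetchone()[0]
```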

Participation Constraint:
A participation constraint is applied to an entity set participating in a relationship set.
1. Total Participation – Each entity in the entity set must participate in the relationship. If
each student must enroll in a course, the participation of Student is total. Total
participation is shown by a double line in the ER diagram.
2. Partial Participation – An entity in the entity set may or may not participate in the
relationship. If some courses have no students enrolled, the participation of
Course is partial.
The diagram depicts the ‘Enrolled in’ relationship set with the Student entity set having total
participation and the Course entity set having partial participation.
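
The difference between total and partial participation can be checked on sample data. In this
sketch the participation function and the student and course identifiers are hypothetical.

```python
def participation(entity_set, relationship, position):
    """Return 'total' if every entity appears in the relationship, else 'partial'.

    position is the index of the entity set within each relationship tuple.
    """
    participating = {pair[position] for pair in relationship}
    return "total" if set(entity_set) <= participating else "partial"


students = {"s1", "s2"}
courses = {"c1", "c2", "c3"}
# Enrolled in: every student appears, but course c3 has no students.
enrolled_in = {("s1", "c1"), ("s2", "c1"), ("s2", "c2")}
```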
Weak entities:
A weak entity is an entity that can be identified uniquely only by considering some of its
attributes in conjunction with the primary key of another entity, called the identifying owner.
The following restrictions must hold:
The owner entity set and the weak entity set must participate in a one-to-many relationship set
(one owner entity is associated with one or more weak entities, but each weak entity has a single
owner). This relationship set is called the identifying relationship set of the weak entity set.
The weak entity set must have total participation in the identifying relationship set.

For example, a Dependents entity can be identified uniquely only if we take the key of the owning
Employees entity together with the pname of the dependent. Here the participation of Dependents in
Policy is total, and each Dependents entity appears in at most one (indeed, exactly one, because of
the total participation) Policy relationship.

Fig: Dependents weak entity set (Employees with attributes ssn, name, lot; Policy with attribute
cost; Dependents with attributes pname, age)
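
A possible table mapping for this weak entity set, sketched with Python's sqlite3 module. Folding
the Policy relationship (cost) into the Dependents table is an assumption of this sketch, as are
the sample rows. The key is pname combined with the owner's ssn, and deleting the owner removes its
dependents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT, lot INTEGER);
CREATE TABLE Dependents (
    pname TEXT NOT NULL,      -- identifies a dependent only within one owner
    age   INTEGER,
    cost  REAL,               -- attribute of the Policy relationship
    ssn   TEXT NOT NULL,      -- key of the owning employee (total participation)
    PRIMARY KEY (pname, ssn),
    FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE CASCADE
);
INSERT INTO Employees VALUES ('111', 'A', 1), ('222', 'B', 2);
-- the same pname may occur under different owners
INSERT INTO Dependents VALUES ('Joe', 8, 100.0, '111'), ('Joe', 5, 80.0, '222');
""")

# Deleting the owner also deletes the weak entities that depend on it.
conn.execute("DELETE FROM Employees WHERE ssn = '111'")
remaining = conn.execute("SELECT pname, ssn FROM Dependents").fetchall()
```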

Class Hierarchies:
Class hierarchies are used to classify the entities of an entity set into subclasses. A hierarchy is
represented using the ISA (is a) relationship and can be viewed in two ways.
Generalization: Generalization is the process of extracting common properties from a set of
entities and creating a generalized entity from them. It is a bottom-up approach in which two or
more entities can be generalized to a higher-level entity if they have some attributes in common.
For example, STUDENT and FACULTY can be generalized to a higher-level entity called PERSON as
shown in Figure 1. In this case, common attributes like P_NAME and P_ADD become part of the
higher-level entity (PERSON), and specialized attributes like S_FEE become part of the specialized
entity (STUDENT).

Specialization: In specialization, an entity is divided into sub-entities based on their
characteristics. It is a top-down approach in which a higher-level entity is specialized into two
or more lower-level entities. For example, the EMPLOYEE entity in an employee management system
can be specialized into DEVELOPER, TESTER, etc. as shown in Figure 2. In this case, common
attributes like E_NAME and E_SAL become part of the higher-level entity (EMPLOYEE), and
specialized attributes like TES_TYPE become part of the specialized entity (TESTER).
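
The ISA idea maps naturally onto class inheritance. The sketch below mirrors the PERSON, STUDENT,
FACULTY example; the f_sal attribute of Faculty and the sample values are hypothetical, since the
text names no attribute specific to FACULTY.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Higher-level entity holding the common attributes."""
    p_name: str
    p_add: str


@dataclass
class Student(Person):
    """A Student ISA Person, with its own specialized attribute."""
    s_fee: float


@dataclass
class Faculty(Person):
    """A Faculty ISA Person; f_sal is an assumed attribute."""
    f_sal: float


s = Student("Ravi", "Hyderabad", 5000.0)
f = Faculty("Ajit", "Chennai", 60000.0)
people = [s, f]  # both can be treated as Person entities
```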

Aggregation: Aggregation allows us to indicate that a relationship set participates in another
relationship set. A basic ER diagram cannot represent a relationship between an entity and
a relationship, which may be required in some scenarios. In those cases, a relationship with its
corresponding entities is aggregated into a higher-level entity.
For example, an employee working for a project may require some machinery. So, a REQUIRE
relationship is needed between the relationship WORKS_FOR and the entity MACHINERY. Using
aggregation, the WORKS_FOR relationship with its entities EMPLOYEE and PROJECT is
aggregated into a single entity, and the relationship REQUIRE is created between the aggregated
entity and MACHINERY.
Example 2: Suppose that we have an entity set called Projects and each Projects entity is sponsored
by one or more departments. A department that sponsors a project might assign employees to
monitor the sponsorship.
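Fig: Aggregation (Employees with attributes ssn, name, lot; Monitors with attribute until; Sponsors
with attribute since, relating Projects with attributes pid, started_on, pbudget to Departments
with attributes did, dname, budget)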
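
Aggregation can be sketched by letting Monitors refer to a whole sponsorship, that is a
(pid, did) pair from Sponsors, rather than to a project or a department alone. The identifiers and
dates below are made up, and add_monitor is a hypothetical helper.

```python
# Sponsors relationship set: (pid, did) -> descriptive attribute since
sponsors = {
    ("p1", "d1"): {"since": "2021-04"},
    ("p1", "d2"): {"since": "2022-01"},
}

# Monitors relationship set: (ssn, sponsorship) -> descriptive attribute until
monitors = {}


def add_monitor(ssn, pid, did, until):
    """Relate an employee to an existing sponsorship (the aggregated unit)."""
    sponsorship = (pid, did)
    if sponsorship not in sponsors:
        raise KeyError(f"{sponsorship} is not a sponsorship")
    monitors[(ssn, sponsorship)] = {"until": until}


add_monitor("111", "p1", "d1", "2023-12")

try:
    add_monitor("111", "p2", "d1", "2023-12")  # d1 does not sponsor p2
    dangling_rejected = False
except KeyError:
    dangling_rejected = True
```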
Conceptual Design with the ER Model:

Developing an ER diagram presents several choices:

1. Should a concept be modeled as an entity or an attribute?

2. Should a concept be modeled as an entity or relationship?

3. What are the relationship sets and their participating entity sets? Should we use binary or
ternary relationships?

4. Should we use aggregation?

A concept can be modeled as an entity or an attribute:

Sometimes it is not clear whether a property should be modeled as an attribute or an entity set.
If we have to record only one value of the property for each entity, it can be modeled as an
attribute. It should be modeled as an entity set if we have to record more than one value, or if
we want to capture the structure of the property in the ER model.

For example, consider modeling a concept as an entity set rather than an attribute. An employee
works for a department, and the Works_In relationship records the interval (from, to) during
which the employee works for the department.

Fig: Works_In relationship set (Employees with attributes ssn, name, lot; Works_In with attributes
from, to; Departments with attributes did, dname, budget)

Suppose an employee works in a department for more than one period. The problem is that we want
to record several values of the descriptive attributes for each instance of the Works_In
relationship. We can address this by defining Duration as an entity set that participates in the
Works_In relationship.
Fig: Works_In relationship set (Employees with attributes ssn, name, lot; Departments with
attributes did, dname, budget; Duration with attributes from, to; all three participate in Works_In)
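
The effect of the two designs can be seen on sample data. In this sketch (identifiers and dates
are made up), the binary Works_In keyed on (ssn, did) can hold only one interval per pair, while
the version that includes a Duration entity can hold several.

```python
# Binary Works_In: (ssn, did) -> (from, to). One interval per pair.
binary_works_in = {}
binary_works_in[("111", 51)] = ("1991-01", "1993-06")
binary_works_in[("111", 51)] = ("1995-02", "1997-09")  # replaces the first period

# Works_In with Duration as a third participant: (ssn, did, duration).
ternary_works_in = set()
ternary_works_in.add(("111", 51, ("1991-01", "1993-06")))
ternary_works_in.add(("111", 51, ("1995-02", "1997-09")))

# All periods during which employee 111 worked in department 51.
periods = sorted(d for ssn, did, d in ternary_works_in if (ssn, did) == ("111", 51))
```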

A concept can be modeled as an entity or relationship:

Consider the relationship set Manages, and suppose that each department manager is given a
discretionary budget (dbudget).

Fig: Entity versus relationships (Employees with attributes ssn, name, lot; Manages with attributes
since, dbudget; Departments with attributes did, dname, budget)

If the discretionary budget is a sum that covers all departments managed by an employee, then
every Manages relationship that involves a given employee will have the same value in the dbudget
field. This leads to redundant storage of the same information.

This problem can be addressed by adding a new entity set called Manager in an ISA hierarchy with
the Employees entity set, with dbudget as an attribute of Manager and since as an attribute of the
Manages relationship.
Fig: Entity versus relationships (Employees with attributes ssn, name, lot; Manages with attribute
since; Departments with attributes did, dname, budget; Manager ISA Employees, with attribute dbudget)

A concept can be modeled as binary or ternary relationships:

Consider an ER diagram in which an employee can own several policies, each policy can be owned by
several employees, each dependent can be covered by several policies, and a Dependents entity must
be deleted if the owning Employees entity is deleted.

Fig: policy as entity set (ternary relationship Covers among Employees with attributes ssn, name,
lot; Policy with attributes pid, cost; Dependents with attributes pname, age)

Suppose we have the following requirements:

• A policy cannot be owned jointly by two or more employees.
• Every policy must be owned by some employee.
• Dependents is a weak entity set, and each Dependents entity is uniquely identified by taking
pname in conjunction with the pid of a Policy entity.
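Fig: policy revisited (Employees with attributes ssn, name, lot; Policy with attributes pid, cost;
Dependents with attributes pname, age; relationship set Purchaser between Employees and Policy,
and relationship set Beneficiary between Policy and Dependents)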

A concept can be modeled as ternary or Aggregation:

Using aggregation, we can model the following concept: a project can be sponsored by any number
of departments, a department can sponsor one or more projects, and each sponsorship is monitored
by one or more employees.

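Fig: Aggregation (Employees with attributes ssn, name, lot; Monitors with attribute until; Sponsors
with attribute since, relating Projects with attributes pid, started_on, pbudget to Departments
with attributes did, dname, budget)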
If we do not need to record the until attribute of Monitors, we might use a ternary relationship
instead. However, with the ternary relationship we cannot express the constraint that each
sponsorship is monitored by at most one employee, whereas with aggregation we can.

Fig: using a ternary relationship instead of aggregation (Sponsors relating Employees with
attributes ssn, name, lot; Projects with attributes pid, started_on, pbudget; Departments with
attributes did, dname, budget)

ER Diagrams to practice:
ER diagram of Bank Management System

ER diagram is known as Entity-Relationship diagram. It is used to analyze the structure of the
database. It shows relationships between entities and their attributes. An ER model provides a
means of communication.
The ER diagram of the Bank has the following description:
• A bank has customers.
• Banks are identified by a name, code, and address of the main office.
• Banks have branches.
• Branches are identified by a branch_no., branch_name, and address.
• Customers are identified by name, cust-id, phone number, and address.
• A customer can have one or more accounts.
• Accounts are identified by acc_no., acc_type, and balance.
• A customer can avail loans.
• Loans are identified by loan_id, loan_type, and amount.
• Accounts and loans are related to a bank's branch.
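
As a starting point for practice, the description above can be sketched as plain Python classes.
The entity and attribute names come from the list; representing relationships as lists and the
sample values are assumptions of this sketch, not a definitive design.

```python
from dataclasses import dataclass, field


@dataclass
class Loan:
    loan_id: str
    loan_type: str
    amount: float


@dataclass
class Account:
    acc_no: str
    acc_type: str
    balance: float


@dataclass
class Customer:
    cust_id: str
    name: str
    phone: str
    address: str
    accounts: list = field(default_factory=list)  # one or more accounts
    loans: list = field(default_factory=list)     # loans availed


@dataclass
class Branch:
    branch_no: str
    branch_name: str
    address: str
    accounts: list = field(default_factory=list)  # accounts held at this branch
    loans: list = field(default_factory=list)     # loans issued by this branch


@dataclass
class Bank:
    code: str
    name: str
    address: str
    branches: list = field(default_factory=list)
    customers: list = field(default_factory=list)


# Hypothetical sample data.
acc = Account("A1", "savings", 2500.0)
loan = Loan("L1", "home", 100000.0)
cust = Customer("C1", "Ravi", "040-1111", "Hyderabad", [acc], [loan])
branch = Branch("B1", "Main", "Hyderabad", [acc], [loan])
bank = Bank("BK1", "Sample Bank", "Hyderabad", [branch], [cust])
```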
ER diagram for Library Management system
ER diagram for online shopping portal:
