Unit I Introtodbms

Introduction to DBMS

Uploaded by

shubhangi.ladde

DEPARTMENT OF INFORMATION TECHNOLOGY

SINHGAD COLLEGE OF ENGINEERING, PUNE

DATABASE MANAGEMENT
SYSTEM
2015 Course

Prof.K.B.Sadafale
Asst.Professor
SCOE,Pune
Teaching Scheme :- Lectures: 4 Hrs/ Week

Examination Scheme:-

In-Semester Assessment
Phase I – 30 Marks

End-Semester Assessment
Phase II – 70 Marks
Course Objectives
1. Understand the fundamental concepts of database management.
These concepts include aspects of database design, database languages,
and database-system implementation.
2. To provide a strong formal foundation in database concepts,
technology and practice.
3. To give systematic database design approaches covering
conceptual design, logical design and an overview of physical
design.
4. Be familiar with the basic issues of transaction processing and
concurrency control.
5. To learn and understand various Database Architectures and
Applications.
6. Understand how analytics and big data affect various functions
now and in the future.
Course Outcomes
1. Define basic functions of DBMS & RDBMS.
2. Analyze database models & Entity Relationship
Models.
3. Design and implement a database schema for a given
problem-domain
4. Populate and query a database using SQL DML/DDL
commands.
5. Programming PL/SQL including stored procedures,
stored functions, cursors and packages
6. Appreciate the impact of analytics and big data on the
information industry and the external ecosystem for
analytical and data services
Syllabus
UNIT I. INTRODUCTION TO DBMS

UNIT II. DATABASE DESIGN AND SQL

UNIT III. QUERY PROCESSING AND DATABASE TRANSACTIONS

UNIT IV. CONCURRENCY CONTROL AND ADVANCED DATABASES

UNIT V. LARGE SCALE DATA MANAGEMENT

UNIT VI. DATA WAREHOUSING AND DATA MINING

UNIT I. Introduction
Introduction:
Database Concepts, Database System Architecture, Data Modeling: Data Models, Basic Concepts,
entity, attributes, relationships, constraints, keys.
E-R and EER diagrams:
Components of E-R Model, conventions, converting E-R diagram into tables, EER Model
components, converting EER diagram into tables, legacy system model.
Relational Model:
Basic concepts, Attributes and Domains, Codd's Rules.
Relational Integrity:
Domain, Entity, Referential Integrities, Enterprise Constraints, Schema Diagram.
Relational Algebra:
Basic Operations, Selection, projection, joining, outer join, union, difference, intersection, Cartesian product,
division operations (examples of queries in relational algebra using symbols).
Text Books

1. Silberschatz A., Korth H., Sudarshan S., "Database System Concepts", 4th Edition, McGraw Hill Publishers, 2002, ISBN 0-07-120413-X

2. Elmasri R., Navathe S., "Fundamentals of Database Systems", 4th Edition, Pearson Education, 2003, ISBN 8129702282.

3. S.K.Singh, "Database Systems: Concepts, Design and Application", 2nd Edition, Pearson, 2013, ISBN 978-81-317-6092-5
Unit I : Introduction
Data vs. information:
What is the difference?
What is data?
• Data is raw, unorganized facts that need to be processed.
• Information science defines data as unprocessed information.
• Ex: Each student's test score is one piece of data.
What is information?
• Information is data that have been organized and communicated in a coherent and meaningful manner.
• Data is converted into information, and information is converted into knowledge.
• Knowledge: information evaluated and organized so that it can be used purposefully.
• Ex: The class average score is the information that can be concluded from the given data.
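The test-score example can be sketched in a few lines of Python. The student names and scores below are made up for illustration; the point is only that the raw scores are data and the computed average is information.

```python
from statistics import mean

# Hypothetical raw data: each test score is one piece of data.
test_scores = {"Asha": 72, "Ravi": 85, "Meera": 91, "John": 64}

# Information: a summary obtained by processing the raw data.
class_average = mean(test_scores.values())
print(f"Class average: {class_average}")
```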
What is a database

 A database is any organized collection of data.

 A database is a collection of related information.


 For example, a phone book is a database of names,
addresses and phone numbers.

 A data file is a single disk file that stores related


information on a hard disk or floppy diskette.
 For example, a phone book database would be stored in
a single data file.
 Some examples of databases you may encounter in your
daily life are:

 a telephone book
 T.V. Guide
 airline reservation system
 motor vehicle registration records
 papers in your filing cabinet
 files on your computer hard drive
Why do we need a database?
Keep records of our:
 Clients
 Staff
 Volunteers
To keep a record of activities;
Keep sales records;
Develop reports;
Perform research
Longitudinal tracking
What is the ultimate purpose of a database management system?
To transform:

Data → Information → Knowledge → Action


What Is a DBMS?
A very large, integrated collection of data and a set of programs to
access those data.
Supports efficient access to very large amounts of data.
Supports concurrent access to very large amounts of data.
Example: bank and its ATM machines
Supports secure, atomic access to very large amounts of data.

The collection of data is usually referred to as the database.

A Database Management System (DBMS) is a software package designed to store and manage databases.

Management of data involves both defining structures for the storage of information and providing mechanisms for the manipulation of information.
Why Use a DBMS?
The goal of a DBMS is to provide an environment that is both
convenient and efficient to use in

Retrieving information from the database.


Storing information into the database.
Data independence and efficient access.
Reduced application development time.
Data integrity and security.
Uniform data administration.
Concurrent access, recovery from crashes.
Databases are used…
To manipulate information so that it can be
sorted and/or searched.

To make record keeping and tracking fast and


efficient.
Database System Applications
Databases are widely used. Some representative
applications:
Banking:
 For customer information, accounts, and loans, and
banking transactions.
Airlines:
 For reservations and schedule information.
Universities:
 For student information, course registrations, and grades.
Credit card transactions:
 For purchases on credit cards and generation of monthly
statements.
Telecommunication:
 For keeping records of calls made, generating monthly bills, maintaining
balances on prepaid calling cards, and storing information about the
communication networks.
Finance:
 For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds.
Sales:
 For customer, product, and purchase information.
Manufacturing:
 For management of supply chain and for tracking production of items in
factories, inventories of items in warehouses/stores, and orders for items.
Human resources:
 For information about employees, salaries, payroll taxes and benefits, and for
generation of paychecks.
Who uses databases?

Almost everyone:
 Business

 Doctors

 Teachers

 Students
Purpose of Database System
1. Let's look at a typical file-processing system supported by a
conventional operating system.

The application is a savings bank:


Savings account and customer records are kept in permanent
system files.
Application programs are written to manipulate files to perform
the following tasks:-
Debit or credit an account.
Add a new account.
Find an account balance.
Generate monthly statements.
2.Development of the system proceeds as follows:

New application programs must be written as the need arises.

New permanent files are created as required.

Over a long period of time, however, files may end up in different formats, and application programs may be written in different languages.
Drawbacks of a typical file-processing system compared with a Database
Management System:
 Data redundancy and inconsistency

Same information may be duplicated in several places.


All copies may not be updated properly.
 Difficulty in accessing data

May have to write a new application program to satisfy


an unusual request.
E.g. find all customers with the same postal code.
Could generate this data manually, but a long job...
 Data isolation

Data in different files.


Data in different formats.
Difficult to write new application programs.
 Multiple users
Want concurrency for faster response time.
Need protection for concurrent updates.
E.g. two customers withdrawing funds from the same
account at the same time - account has $500 in it, and
they withdraw $100 and $50. The result could be $350,
$400 or $450 if no protection.
 Security problems
Every user of the system should be able to access only
the data they are permitted to see.
E.g. payroll people only handle employee records, and
cannot see customer accounts; tellers only access
account data and cannot see payroll data.
Difficult to enforce this with application programs.
 Integrity problems
Data may be required to satisfy constraints.
E.g. no account balance below $25.00.
Again, difficult to enforce or to change constraints with
the file-processing approach.

These problems and others led to the development of


database management systems.
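The concurrent-withdrawal example above ($500 balance, withdrawals of $100 and $50) can be replayed with a small simulation. This is only a sketch: the `run` function and the schedule format are invented for illustration, and each customer is modelled as reading the balance into a private copy and later writing back the copy minus the withdrawal, with no locking.

```python
def run(schedule, balance=500):
    """Replay a schedule of (customer, action) steps against one balance.

    Each customer reads the balance into a private copy, then writes back
    that copy minus the withdrawal amount. No protection is applied.
    """
    amounts = {"A": 100, "B": 50}
    local = {}
    for customer, action in schedule:
        if action == "read":
            local[customer] = balance
        else:  # "write"
            balance = local[customer] - amounts[customer]
    return balance

# One withdrawal finishes before the other starts: the correct result.
serial = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
# Both read the old balance first; the later write overwrites the earlier one.
a_lost = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
b_lost = [("A", "read"), ("B", "read"), ("B", "write"), ("A", "write")]

for name, schedule in [("serial", serial), ("A lost", a_lost), ("B lost", b_lost)]:
    print(name, run(schedule))
```

The three schedules give the three balances named in the example: $350 when the withdrawals do not overlap, and $450 or $400 when one update is lost.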
Data Abstraction
A database system is a collection of interrelated data and set of
programs that allow users to access and modify these data.

 The major purpose of a database system is to provide users


with an abstract view of the system.

 The system hides certain details of how data is stored and


maintained .

 Complexity should be hidden from database users.


 There are several levels of abstraction:

 1. Physical Level:

How the data are stored.

E.g. index, B-tree, hashing.

Lowest level of abstraction.

Complex low-level structures described in detail.


 2. Conceptual Level:

Next highest level of abstraction.

Describes what data are stored.

Describes the relationships among data.

Database administrator level.


 3.View Level:

Highest level.

Describes part of the database for a particular group of users.

Can be many different views of a database.

E.g. tellers in a bank get a view of customer accounts, but not


of payroll data.
Three levels of data abstraction
[Figure: View 1, View 2, ..., View n (view level) sit above the conceptual level, which sits above the physical level.]
The Elements of a Database

The database schema


Schema objects
Indexes
Tables
Fields and columns
Records and rows
The database schema

A schema is quite simply a group of related objects in


a database.
Within a schema, objects that are related have
relationships to one another.

Figure - Collection of objects that comprise a database schema.


Table
A table is the primary unit of physical storage for data in a database.

 When a user accesses the database , a table is usually referenced for


the desired data.

 Multiple tables might comprise a database , therefore a relationship


might exist between tables.

Figure illustrates tables in a schema. Each table in the figure is related


to at least one other table. Some tables are related to multiple tables.

Figure - Database tables and their relationships.


Columns
A column, or field, is a specific category of information that exists
in a table.
A column is to a table what an attribute is to an entity.
In other words, when a business model is converted into a database
model, entities become tables and attributes become columns.
A column represents one related part of a table and is the smallest
logical structure of storage in a database.
Each column in a table is assigned a data type.
The assigned data type determines what type of values can
populate a column.
When visualizing a table, a column is a vertical structure in the
table that contains values for every row of data associated with a
particular column.
A Field
A field (category) is the place where one item of
information is recorded; it is the smallest part of the
database.
Fields
In Figure , columns within the Customers table are shown.

Each column is a specific category of information.

 All of the data in a table associated with a field is called a


column.

Figure- Columns in a database table.


Rows
A row of data is the collection of all the columns in a table
associated with a single occurrence.
Simply speaking, a row of data is a single record in a
table.
For example, if there are 25,000 book titles with which a
bookstore deals, there will be 25,000 records, or rows of
data, in the book titles table once the table is populated.
The number of rows within the table will obviously change
as books' titles are added and removed.

Figure - Row of data in a database table.


A RECORD
One row is one record.
Instances and Schemas
Instances :- The collection of information stored in the database at
a particular moment is called an instance of the database.

It is also called database state.

The term instance is typically used to describe a complete database


environment, including the RDBMS software, table structure, stored
procedures and other functionality.

It is most commonly used when administrators describe multiple instances


of the same database.

Examples: An organization with an employees database might have three


different instances: production (used to contain live data), pre-production
(used to test new functionality prior to release into production) and
development (used by database developers to create new functionality).
Schemas:-
The description of a database is called the database schema.

It is the overall design of the database.

A schema specifies, for example, the record types, the type of data, and the attributes of each record type.

Description of data at some level.

Each level has its own schema

It is specified during database design and is not expected to


change frequently.

A displayed schema is called schema diagram.


Schema diagram for UNIVERSITY database

schema construct

Known data:
name of record types, data items
Example: University Database
Conceptual schema:

Students(sid: string, name: string, login: string, age: integer, gpa:real)

Courses(cid: string, cname:string, credits:integer)

Enrolled(sid:string, cid:string, grade:string)

Physical schema:

Relations stored as unordered files.

Index on first column of Students.

External Schema (View):

Course_info(cid:string, enrollment:integer)

CS542Students(sid: string, grade:string)
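A minimal sketch of the three schema levels for this university example, using Python's built-in sqlite3 module. The type names are SQLite's, and the two view definitions are assumptions: the slide gives only the view signatures, so here Course_info is taken to count enrollments per course and CS542Students to list students enrolled in a course with cid 'CS542'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations from the example.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    CREATE INDEX idx_students_sid ON Students (sid);

    -- External schema (views); the definitions are assumed for illustration.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
    CREATE VIEW CS542Students AS
        SELECT sid, grade FROM Enrolled WHERE cid = 'CS542';
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "CS542", "A"), ("s2", "CS542", "B"), ("s1", "CS101", "A")],
)

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
cs542 = conn.execute(
    "SELECT sid, grade FROM CS542Students ORDER BY sid").fetchall()
print(course_info)
print(cs542)
```

Users of the views never see how Enrolled is stored or indexed, which is the separation the three levels are meant to provide.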


Database Languages
A database system provides a Data-Definition
Language (DDL) to specify the database schema.

A Data-Manipulation Language (DML) to express


database queries and updates.

DDL and DML are not two separate languages; instead, they are parts of a single database language, such as the widely used SQL.
Data Definition Language (DDL)
Used to specify a database schema as a set of definitions expressed in a
DDL.

The DDL, just like any programming language, takes some instructions as
input and generates some output.

The output of the DDL is placed in the data dictionary.

The data dictionary contains metadata (data about data).

The data dictionary is considered to be a special type of table, which can


only be accessed and updated by the database system itself.

A database system consults the data dictionary before reading or


modifying actual data.
The data values stored in database must satisfy
certain consistency constraints.

E.g.-suppose the balance on an account should not


fall below $1000.

The DDL provides facilities to specify such


constraints.

The database system checks these constraints every time the database is updated.
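As a sketch of how such a consistency constraint can be declared in DDL, the following uses Python's built-in sqlite3 module and a CHECK constraint. The table and column names are hypothetical; only the $1000 minimum comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        NUMERIC NOT NULL CHECK (balance >= 1000)
    )
""")
conn.execute("INSERT INTO account VALUES ('A-101', 1500)")
conn.commit()

def try_update(new_balance):
    """Return True if the update is accepted, False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = ? WHERE account_number = 'A-101'",
                (new_balance,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_update(1200))  # satisfies the constraint
print(try_update(900))   # violates the constraint
```

The check lives in the schema, so every application that updates the table is subject to it without repeating the rule in application code.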
Domain Constraints –
A domain of possible values must be associated with every
attribute (e.g. integer type, character type, date type).
Domain constraints are the most elementary form of integrity
constraints.
They are tested easily by the system whenever a new data
item is inserted into the database.

Referential Integrity –
There are cases where we wish to ensure that a value that
appears in one relation for a given set of attributes also appears for
a certain set of attributes in another relation.
Database modifications can cause violations of referential
integrity.
When a referential integrity constraint is violated, the normal
procedure is to reject the action that caused the violation.
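A small sketch of referential integrity being enforced, again with Python's sqlite3 module. The customer and loan tables and their rows are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other systems typically enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE loan (
        loan_number TEXT PRIMARY KEY,
        customer_id TEXT NOT NULL REFERENCES customer (customer_id)
    );
    INSERT INTO customer VALUES ('C-1', 'Hayes');
    INSERT INTO loan VALUES ('L-17', 'C-1');
""")

def rejected(sql):
    """Run one modification; return True if the DBMS rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A loan for a customer that does not exist violates referential integrity.
print(rejected("INSERT INTO loan VALUES ('L-99', 'C-404')"))
# So does deleting a customer that a loan still refers to.
print(rejected("DELETE FROM customer WHERE customer_id = 'C-1'"))
# A loan for an existing customer is accepted.
print(rejected("INSERT INTO loan VALUES ('L-18', 'C-1')"))
```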
Assertions –
An assertion is any condition that the database must always
satisfy.
Domain constraints and referential integrity constraints are special
forms of assertion.
There are many constraints that we cannot express by using only
these special forms.
E.g. “Every loan has at least one customer who maintains an
account with minimum balance of $10,000” must be expressed as
an assertion.
Authorization –
We may want to differentiate among the users as far as the type of
access they are permitted on various data values in the database.
These differences are expressed in terms of authorization.
E.g. read authorization, insert authorization, delete
authorization, update authorization.
Data Manipulation Language (DML)
A data manipulation language (DML) is a language that enables
users to access or manipulate data.
The goal is to provide efficient human interaction with the
system.
Data Manipulation is:

Retrieval of information from the database

Insertion of new information into the database

Deletion of information in the database

Modification of information in the database


There are two types of DML: -

Procedural DMLs – require a user to specify what data are
needed and how to get those data.
Procedural languages are used in traditional programming,
which is based on algorithms or a logical step-by-step process for
solving a problem.

Non-procedural DMLs – require a user to specify what data are
needed without specifying how to get those data.
Non-procedural languages allow users and professional
programmers to specify the results they want without
specifying how to solve the problem.
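The contrast can be illustrated with one request answered in both styles. This is a sketch with made-up account rows: the loop stands in for a procedural DML, and the SQL SELECT (run through Python's sqlite3 module) stands in for a non-procedural one.

```python
import sqlite3

accounts = [("A-101", 500), ("A-215", 700), ("A-102", 400)]

# Procedural style: we spell out HOW to find the rows, step by step.
procedural = []
for number, balance in accounts:
    if balance > 450:
        procedural.append(number)
procedural.sort()

# Non-procedural style: we state WHAT we want and let the DBMS decide how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
declarative = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account "
        "WHERE balance > 450 ORDER BY account_number"
    )
]

print(procedural, declarative)
```

Both produce the same answer; the difference is who chooses the access strategy.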
Query:-
In general, a query is a question.
In computing, what a user of a search engine or database enters
is sometimes called the query.
A query is a statement requesting the retrieval of information.
A database query can be either a select query or an action query.

Query Language:-
Languages used to interact with databases are called query
languages, of which SQL is a well-known standard.
The portion of a DML that involves information retrieval is
called a query language.
It is a common practice to use the terms query language and
data manipulation language synonymously.
Data Models
Data model is a collection of concepts that can be used to describe the
structure of a database.
Data models are a collection of conceptual tools for describing data, data
relationships, data semantics(relation between stored data and real world)
and data constraints
Business Model is a plan implemented by a company to generate revenue
and make a profit from operations
The model includes the components and functions of the business , as well
as the revenues it generates and the expenses it incurs.
There are three different Models

 Object-based Logical Models.

 Record-based Logical Models.

 Physical Data Models.


Object-based Logical Models
 Object-based logical models:
 The object based models use the concepts of entities or objects and
relationships among them rather than the implementation based
concepts such as records used in the record based models
Describe data at the conceptual and view levels.
Provide fairly flexible structuring capabilities.
Allow one to specify data constraints explicitly.
Examples of such models:
Entity-relationship model.
Object-oriented model
Binary model.
Semantic data model
Functional data model
The E-R Model
An entity-relationship model (ERM) is an abstract and
conceptual representation of data.

The entity-relationship model (or ER model) is a way of


graphically representing the logical relationships of entities (or
objects) in order to create a database.

Entity-relationship modeling is a database modeling method,


used to produce a type of conceptual schema or semantic data
model of a system.
Diagrams created by this process are called entity-
relationship diagrams, ER diagrams, or ERDs.

The entity-relationship model is based on a perception of the world as consisting of a collection of basic objects (entities) and relationships among these objects.

The ER model was developed to facilitate database design by


allowing specification of an enterprise schema that represents
the overall logical structure of database.
 Entity:-

An entity is a distinguishable object that exists.


Entity is a person , place , thing or concept about which data
can be collected
Example: House ,Car , Employee

Each entity has associated with it a set of attributes describing


it.
E.g. number and balance for an account entity

Entities have attributes



Example: people have names and addresses
Entity
Entity Sets
An entity set is a set of entities of the same type that share the
same properties.
An entity set is the collection of all entities of a particular entity
type at any point in time.
 Example: set of all persons, companies, trees, holidays

[Figure: entity sets customer and loan. customer has the attributes customer-id, customer-name, customer-street and customer-city; loan has the attributes loan-number and amount.]

Example: a student entity set with four entities: John, Smith, Sana and Lee.
Attributes
An entity is represented by a set of attributes, that is
descriptive properties possessed by all members of an
entity set.
Example:
customer = (customer-id, customer-name,
customer-street, customer-city)

loan = (loan-number, amount)

Domain – the set of permitted values for each attribute
[Figures: attributes of Product; attributes of Supplier]
Attributes
 Attribute types:-
 Simple and composite attributes.
 Single-valued and multi-valued attributes
 E.g. multivalued attribute: phone-numbers
 Derived attributes
 Can be computed from other attributes
 E.g. age, given date of birth
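The attribute types above can be sketched with a small Python class. The `Customer` class, its fields and the sample values are hypothetical; the point is that phone numbers form a multivalued attribute and age is derived from the stored date of birth rather than stored itself.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Customer:
    name: str                                          # simple, single-valued attribute
    date_of_birth: date                                # stored attribute
    phone_numbers: list = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

hayes = Customer("Hayes", date(1990, 6, 15), ["555-0100", "555-0101"])
print(hayes.age(date.today()))
```

Because age is computed on demand, it can never disagree with the stored date of birth.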
Composite Attributes
Relationship:-
A relationship is an association among several entities.
e.g. A cust_acct relationship associates a customer with each
account he or she has.
The set of all relationships of the same type is called the
relationship set.

Example:
(Hayes, A-102) is a relationship in the depositor relationship set, linking the customer entity Hayes to the account entity A-102.

A relationship set is a mathematical relation among $n \geq 2$ entities, each taken from entity sets:
$\{(e_1, e_2, \ldots, e_n) \mid e_1 \in E_1, e_2 \in E_2, \ldots, e_n \in E_n\}$
where $(e_1, e_2, \ldots, e_n)$ is a relationship.

Example: (Hayes, A-102) $\in$ depositor
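The definition says a binary relationship set is a subset of the Cartesian product of the participating entity sets. A short Python sketch makes this concrete; apart from (Hayes, A-102), the entity names below are invented sample values.

```python
from itertools import product

customer = {"Hayes", "Jones", "Smith"}   # entity set E1 (sample values)
account = {"A-101", "A-102", "A-215"}    # entity set E2 (sample values)

# A relationship set: some of the possible (customer, account) pairs.
depositor = {("Hayes", "A-102"), ("Jones", "A-101"), ("Smith", "A-215")}

# Every possible pairing of a customer with an account.
all_possible_pairs = set(product(customer, account))

# depositor is a valid relationship set if it is a subset of E1 x E2.
is_valid_relationship_set = depositor <= all_possible_pairs
print(is_valid_relationship_set, ("Hayes", "A-102") in depositor)
```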


Relationship Set borrower
Relationship Sets (Cont.)
An attribute can also be property of a relationship set.

For instance, the depositor relationship set between entity sets


customer and account may have the attribute access-date
Degree of a Relationship Set
Refers to number of entity sets that participate in a relationship
set.
Relationship sets that involve two entity sets are binary (or
degree two). Generally, most relationship sets in a database
system are binary.
Relationship sets may involve more than two entity sets.

E.g. Suppose employees of a bank may have jobs (responsibilities) at multiple


branches, with different jobs at different branches. Then there is a ternary
relationship set between entity sets employee, job and branch

Relationships between more than two entity sets are rare. Most
relationships are binary.
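Ternary relationship set
[Figure: labels recoverable from the diagram are the entity sets Branch, Customer, Account and Job, and the attributes branch name, branch city, assets, social security, cust name, street, cust city, account no and balance.]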
Mapping Cardinalities
Express the number of entities to which another entity can be
associated via a relationship set.
Most useful in describing binary relationship sets.
For a binary relationship set the mapping cardinality must be
one of the following types:
 One to one
 One to many
 Many to one
 Many to many
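The four types can be told apart by checking whether any entity on either side is associated with more than one entity on the other side. The function below is a sketch written for illustration. It classifies one particular set of (a, b) pairs, whereas a mapping cardinality in a design is a constraint on every instance the database may ever hold.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a binary relationship set, given as (a, b) pairs, from A to B."""
    b_per_a, a_per_b = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())    # some a maps to several b
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())  # some b maps to several a
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"
    if b_has_many:
        return "many to one"
    return "one to one"

owns = [("person1", "car1"), ("person1", "car2"), ("person2", "car3")]
print(mapping_cardinality(owns))
```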
Mapping Cardinalities
[Figures: (a) one to one, (b) one to many, (c) many to one, (d) many to many]
Note: Some elements in A and B may not be mapped to any elements in the other set.
The one to one relationship 1:1
Examples:
1. In South Africa a person can have one and only one
Identity Document. Also, a specific Identity Document can
belong to one and only one person.

2. A voter can cast only one vote in an election.


A ballot paper can belong to only one voter.
So there will be a 1:1 relationship between a Voter and a
Ballot Paper.
The one to many or many to one relationship 1:m or m:1
Examples:
1. Master-detail. You have a master (or header) record with many detail
records. For example, an order: there will be a master record with the order
date, the person placing the order, etc., and then detail records of everything
ordered. The master record will have many details, and each detail will have only 1
master.

2. Supervisor-subordinates. A supervisor will have one or many subordinates. A
subordinate will have only 1 supervisor.

3. Mother-Children. A child will have only 1 (biological) mother. A mother will


have zero, one, or many children.

4. Division- department. A division will have one or many departments. A


department will belong to only 1 division.

5. A person can own more than one car. A car can only have one owner. So
Owner to Car will be a one to many or 1:M relationship.

6. A person can buy one or many movie tickets. A ticket will belong to only one
person.
The many to many relationship m:m
Examples:
1. Student - professor. A student will have one or more
professors. The same professor will have lots of students.

2. Parent - child. A parent can have zero, one, or many
children. A child will have more than one (biological) parent.

3. At a hospital a patient will be assigned to a couple of
nurses. A specific nurse will be assigned to 1 or many
patients.

4. A student will have lots of subjects and the same subject


can be taken by lots of students
Database Modeling and Implementation Process

Ideas → ER Design → Relational Schema → Relational DBMS Implementation
E-R Diagrams

 Rectangles represent entity sets.


 Diamonds represent relationship sets.
 Lines link attributes to entity sets and entity sets to relationship sets.
 Ellipses represent attributes
 Double ellipses represent multivalued attributes.
 Dashed ellipses denote derived attributes.
 Underline indicates primary key attributes (will study later)
E-R Diagram With Composite, Multivalued, and
Derived Attributes
Relationship Sets with Attributes
Roles
The labels “manager” and “worker” are called roles.
They specify how employee entities interact via the
works-for relationship set.
Roles are indicated in E-R diagrams by labeling the lines
that connect diamonds to rectangles.
Role labels are optional, and are used to clarify semantics
of the relationship
Cardinality Constraints
We express cardinality constraints by drawing either a
directed line (→), signifying “one,” or an undirected line
(—), signifying “many,” between the relationship set and
the entity set.
E.g.: One-to-one relationship:
 A customer is associated with at most one loan via the

relationship borrower
 A loan is associated with at most one customer via

borrower
One-To-Many Relationship
In the one-to-many relationship a loan is associated with at
most one customer via borrower, a customer is associated
with several (including 0) loans via borrower
Many-To-Many Relationship

A customer is associated with several (possibly 0)


loans via borrower
A loan is associated with several (possibly 0)
customers via borrower
Participation of an Entity Set in a Relationship Set
 Total participation (indicated by double line): every entity in the entity set
participates in at least one relationship in the relationship set
 E.g. participation of loan in borrower is total
 every loan must have a customer associated to it via borrower
 Partial participation: some entities may not participate in any relationship in
the relationship set
 E.g. participation of customer in borrower is partial
Alternative Notation for Cardinality Limits

 Cardinality limits can also express participation constraints


Keys
A key is a column (attribute), or a set of columns, of a database
table that is used to identify its rows.
The following are the various types of keys available in the
DBMS system.
Super Key –
An attribute or a combination of attributes that is used to
identify the records uniquely is known as a Super Key.
A table can have many Super Keys.
E.g. of Super Key
1 ID
2 ID, Name
3 ID, Address
4 ID, Department_ID
5 ID, Salary
6 Name, Address
7 Name, Address, Department_ID
………… So on as any combination which can identify the records
uniquely will be a Super Key.
Candidate Key –
It can be defined as a minimal Super Key or irreducible Super Key.
In other words, it is an attribute or a combination of attributes that
identifies the record uniquely, while none of its proper subsets can
identify the records uniquely.

E.g. of Candidate Key

Employee (‘ID’ , ‘Name’ ,’Address’)

For the above table we have only two Candidate Keys:

‘ID’ alone can identify the record uniquely.
Similarly, the combination of Name and Address can identify the record
uniquely.
Neither Name nor Address alone can identify the records
uniquely.
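The relationship between super keys and candidate keys can be checked mechanically on sample data. The sketch below uses made-up Employee rows and two helper functions written for illustration. One caveat: a key is a property of the schema (it must hold for every legal instance), so a check against a few rows can only rule keys out, not prove them.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """attrs is a super key for these rows if no two rows agree on all of attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(rows)

def candidate_keys(rows, attributes):
    """Return the minimal super keys: those with no proper subset that is a super key."""
    keys = []
    # Smaller attribute sets are tried first, so any key contained in a
    # larger set has already been found when that larger set is tested.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if is_superkey(rows, attrs) and not any(set(k) <= set(attrs) for k in keys):
                keys.append(attrs)
    return keys

# Hypothetical Employee rows.
employees = [
    {"ID": 1, "Name": "Sally", "Address": "Pune"},
    {"ID": 2, "Name": "Sally", "Address": "Mumbai"},
    {"ID": 3, "Name": "Harry", "Address": "Pune"},
]

print(candidate_keys(employees, ["ID", "Name", "Address"]))
```

On these rows, (ID, Name) is a super key but not a candidate key, because its subset ID is already unique.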
Primary Key –
A Candidate Key that is used by the database designer for unique
identification of each row in a table is known as Primary Key.
A Primary Key can consist of one or more attributes of a table.

E.g. of Primary Key -


Customer
CustomerNo FirstName LastName
1 Sally Thompson
2 Sally Henderson
3 Harry Henderson
4 Sandra Wellington
For example, in the table above, CustomerNo is the primary key.
Foreign Key –
A foreign key is an attribute or combination of attributes in one base
table that points to the candidate key (generally it is the primary key) of
another table.
The purpose of the foreign key is to ensure referential integrity of the
data, i.e. only values that are supposed to appear in the database are
permitted.
For example, OrderNo is the primary key of the table ORDERS below
and CustomerNo is a foreign key that points to the primary key in the
CUSTOMERS table.
ORDERS
OrderNo EmployeeNo CustomerNo Supplier Price Item
1 1 42 Harrison $235 Desk
2 4 1 Ford $234 Chair
3 1 68 Harrison $415 Table
4 2 112 Ford $350 Lamp
5 3 42 Ford $234 Chair
6 2 112 Ford $350 Lamp
7 2 42 Harrison $235 Desk
Composite Key –
If we use multiple attributes to create a Primary Key then that
Primary Key is called Composite Key (also called a Compound Key
or Concatenated Key).
E.g. of Composite Key, if we have used “Name, Address” as a
Primary Key then it will be our Composite Key.
Alternate Key –
Alternate Key can be any of the Candidate Keys except for the
Primary Key.
E.g. of Alternate Key is “Name, Address” as it is the only other
Candidate Key which is not a Primary Key.
Secondary Key –
The attributes that are not even the Super Key but can be still used for
identification of records (not unique) are known as Secondary Key.
E.g. of Secondary Key can be Name, Address, Salary,
Department_ID etc. as they can identify the records but they might
not be unique
E-R Diagram with a Ternary
Relationship
Converting Non-Binary Relationships to Binary
Form
In general, any non-binary relationship can be represented using binary
relationships by creating an artificial entity set.
• Replace R between entity sets A, B and C by an entity set E, and three
relationship sets:
1. $R_A$, relating E and A
2. $R_B$, relating E and B
3. $R_C$, relating E and C
• Create a special identifying attribute for E
• Add any attributes of R to E

For each relationship $(a_i, b_i, c_i)$ in R:
1. create a new entity $e_i$ in the entity set E
2. add $(e_i, a_i)$ to $R_A$
3. add $(e_i, b_i)$ to $R_B$
4. add $(e_i, c_i)$ to $R_C$
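The conversion steps can be sketched directly in Python. The function name, the generated identifiers e1, e2, ... and the sample (employee, job, branch) triples are all assumptions made for illustration.

```python
def to_binary(r):
    """Replace a ternary relationship set r of (a, b, c) triples by an
    artificial entity set E and three binary relationship sets RA, RB, RC."""
    e, ra, rb, rc = [], set(), set(), set()
    for i, (a, b, c) in enumerate(sorted(r), start=1):
        ei = f"e{i}"  # special identifying attribute for the new entity
        e.append(ei)
        ra.add((ei, a))
        rb.add((ei, b))
        rc.add((ei, c))
    return e, ra, rb, rc

# Hypothetical ternary relationship set: (employee, job, branch).
works_on = {("Jones", "teller", "Downtown"), ("Jones", "auditor", "Perryridge")}
E, RA, RB, RC = to_binary(works_on)

# The original triples can be recovered by following each new entity
# through the three binary relationship sets.
recovered = {(dict(RA)[e], dict(RB)[e], dict(RC)[e]) for e in E}
print(recovered == works_on)
```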
Weak Entity Sets
An entity set that does not have a primary key is referred to as a
weak entity set.

Identifying relationship depicted using a double diamond

The discriminator (or partial key) of a weak entity set is the set
of attributes that distinguishes among all the entities of a weak
entity set.

The primary key of a weak entity set is formed by the primary


key of the strong entity set on which the weak entity set is
existence dependent, plus the weak entity set’s discriminator.
Weak Entity Sets (Cont.)
We depict a weak entity set by double rectangles.
We underline the discriminator of a weak entity set with a
dashed line.
payment-number – discriminator of the payment entity set
Primary key for payment – (loan-number, payment-number)
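A minimal sketch of the loan/payment example as tables, using Python's built-in sqlite3 module. The column names amount and payment_amount and the sample rows are assumptions; only loan-number and payment-number come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
conn.execute("""
    CREATE TABLE payment (
        loan_number    TEXT NOT NULL
                       REFERENCES loan(loan_number) ON DELETE CASCADE,
        payment_number INTEGER NOT NULL,            -- discriminator
        payment_amount REAL,
        PRIMARY KEY (loan_number, payment_number)   -- strong key + discriminator
    )
""")
conn.execute("INSERT INTO loan VALUES ('L-17', 1000), ('L-23', 2000)")
# payment_number 1 may repeat across loans, but not within one loan.
conn.execute("INSERT INTO payment VALUES ('L-17', 1, 50), ('L-23', 1, 75)")

# Existence dependency: deleting the loan removes its payments.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute(
    "SELECT loan_number, payment_number FROM payment"
).fetchall()
```

ON DELETE CASCADE is one way to model existence dependency; other designs may prefer to block the delete instead.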
More Weak Entity Set Examples
• In a university, a course is a strong entity and a
course-offering can be modeled as a weak entity.
• The discriminator of course-offering would be
semester (including year) and section-number (if
there is more than one section).
Extended Entity-Relationship Model
• Since the 1980s there has been an increase in the emergence of
new database applications with more demanding
requirements.
• The basic concepts of ER modelling are not sufficient to
represent the requirements of newer, more complex
applications.
• The response is the development of additional ‘semantic’
modelling concepts.
The enhanced entity-relationship (EER) model is a high-level
or conceptual data model incorporating extensions to the original
entity-relationship (ER) model, used in the design of databases.
• It was developed to reflect more precisely the properties and
constraints that are found in more complex databases.
• The EER model includes all of the concepts introduced by the
ER model.
• Additionally, it includes the concepts of a subclass and super
class, along with the concepts of specialization and generalization.
Subclasses and Super classes
An entity type may have additional meaningful subgroupings of its entities.
Example: EMPLOYEE may be further grouped into SECRETARY, ENGINEER,
MANAGER, TECHNICIAN, SALARIED_EMPLOYEE, HOURLY_EMPLOYEE, …
• Each of these groupings is a subset of EMPLOYEE entities
• Each is called a subclass of EMPLOYEE
• EMPLOYEE is the super class for each of these subclasses
These are called super class/subclass relationships.
Example: EMPLOYEE/SECRETARY, EMPLOYEE/TECHNICIAN
Another example of super class and subclass:-
[Figure: super class vehicle with the subclasses truck, scooter, Bus and car.]
Specialization
Is the process of defining a set of subclasses of a super class.
The set of subclasses is based upon some distinguishing
characteristics of the entities in the super class.
Example: {SECRETARY, ENGINEER, TECHNICIAN} is
a specialization of EMPLOYEE based upon job type.
 May have several specializations of the same super class.
Example: Another specialization of EMPLOYEE, based on
method of pay, is {SALARIED_EMPLOYEE,
HOURLY_EMPLOYEE}.
Super class/subclass relationships and specialization can be
diagrammatically represented in EER diagrams.
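[Figure: EER diagram in which student (attributes sid, name) is linked through an ISA relationship to the subclasses Undergrad and graduate.]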

Attributes of a subclass are called specific attributes.
For example, Typing Speed of SECRETARY.
The subclass can participate in specific relationship types.
For example, BELONGS_TO of HOURLY_EMPLOYEE.
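One common way to carry a specialization into tables is a table for the super class and a table per subclass that shares its key. This is a sketch using Python's built-in sqlite3 module, not the only possible mapping. The column names emp_id, name and typing_speed and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE secretary (
        emp_id       INTEGER PRIMARY KEY REFERENCES employee(emp_id),
        typing_speed INTEGER        -- specific attribute of the subclass
    )
""")
conn.execute("INSERT INTO employee VALUES (1, 'Meera'), (2, 'John')")
conn.execute("INSERT INTO secretary VALUES (1, 70)")

# A secretary inherits name from EMPLOYEE through the shared key.
secretaries = conn.execute("""
    SELECT e.emp_id, e.name, s.typing_speed
    FROM employee AS e JOIN secretary AS s ON s.emp_id = e.emp_id
""").fetchall()

def secretary_without_employee_accepted():
    """A subclass row with no super class row should be refused."""
    try:
        conn.execute("INSERT INTO secretary VALUES (99, 60)")
    except sqlite3.IntegrityError:
        return False
    return True
```

The foreign key expresses that every SECRETARY is also an EMPLOYEE, which is the subset property of a subclass.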
Example of a Specialization
Generalization
The reverse of the specialization process.
• Several classes with common features are generalized into a
super class; the original classes become its subclasses.
• Example: CAR, TRUCK generalized into VEHICLE.
• Both CAR and TRUCK become subclasses of the super class
VEHICLE.
• We can view {CAR, TRUCK} as a specialization of
VEHICLE.
• Alternatively, we can view VEHICLE as a generalization
of CAR and TRUCK.
Generalization (Cont.)
• A bottom-up design process: combine a number of entity
sets that share the same features into a higher-level entity
set.
• Specialization and generalization are simple inversions of
each other; they are represented in an E-R diagram in the
same way.
• The terms specialization and generalization are used
interchangeably.
Generalization and Specialization
• An arrow pointing to the generalized super class represents a
generalization.
• Arrows pointing to the specialized subclasses represent a
specialization.
• We advocate not drawing any arrows in these situations.
• A super class or subclass represents a set of entities.
• It is shown as a rectangle in EER diagrams (as are entity types).
• Sometimes, all entity sets are simply called classes, whether
they are entity types, super classes, or subclasses.
GENERALIZATION AND SPECIALIZATION
[Figure: student (attributes sid, name) linked through an ISA relationship to Undergrad and graduate, with the two directions labelled Generalization and Specialization.]
Aggregation
The E-R model cannot express relationships among relationships.
When would we need such a thing?
Consider a DB with information about employees who work on a
particular project and use a number of machines doing that work. We get
the E-R diagram shown in Figure
[Figure: E-R diagram with redundant relationships, relating the entity sets Employee, Project and Machinery through the relationship sets Work and Uses. Attribute labels: Name, Id, Hours, Member.]
Relationship sets work and uses could be combined into a
single set. However, they shouldn't be, as this would obscure
the logical structure of this schema.
The solution is to use aggregation.
Aggregation is an abstraction through which relationships are
treated as higher-level entities.
For our example, we treat the relationship set work and the
entity sets employee and project as a higher-level entity set
called work.
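A sketch of how the aggregated work entity might look as tables, using Python's built-in sqlite3 module. The key columns emp_id, project_id and machine_id and the sample rows are assumptions. The point is that uses refers to a whole (employee, project) work pair rather than to employee and project separately.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee  (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project   (project_id INTEGER PRIMARY KEY);
    CREATE TABLE machinery (machine_id INTEGER PRIMARY KEY);

    -- The aggregate: each (employee, project) pair is one "work" entity.
    CREATE TABLE work (
        emp_id     INTEGER REFERENCES employee(emp_id),
        project_id INTEGER REFERENCES project(project_id),
        hours      REAL,
        PRIMARY KEY (emp_id, project_id)
    );

    -- "uses" relates the aggregate work to machinery.
    CREATE TABLE uses (
        emp_id     INTEGER,
        project_id INTEGER,
        machine_id INTEGER REFERENCES machinery(machine_id),
        PRIMARY KEY (emp_id, project_id, machine_id),
        FOREIGN KEY (emp_id, project_id) REFERENCES work(emp_id, project_id)
    );

    INSERT INTO employee  VALUES (1, 'Asha');
    INSERT INTO project   VALUES (100);
    INSERT INTO machinery VALUES (7), (8);
    INSERT INTO work      VALUES (1, 100, 12.5);
    INSERT INTO uses      VALUES (1, 100, 7);
""")

def uses_without_work():
    """Machinery cannot be used by a work pair that does not exist."""
    try:
        conn.execute("INSERT INTO uses VALUES (1, 999, 8)")
    except sqlite3.IntegrityError:
        return "rejected"
    return "accepted"

machines = conn.execute(
    "SELECT machine_id FROM uses WHERE emp_id = 1 AND project_id = 100"
).fetchall()
```

The composite foreign key on uses is what treats the work relationship as an entity in its own right.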
[Figure: E-R diagram with aggregation, showing Employee, Work, Project, Uses and Machinery. Attribute labels: Name, Id, Hours, Member.]
Summary of Symbols Used in E-R Notation
E-R Diagram for a Banking Enterprise
The Relational Model
Data and relationships are represented by a collection of tables.
Each table has a unique name (e.g. customer, account) and a number
of columns, each with a unique name within its table.
Fig:- A sample relational database.


Example of a Relation
[Figure: a relation shown as a table; its columns are labelled attributes and its rows are labelled tuples.]
Codd's Rules
Codd's twelve rules are a set of thirteen rules (numbered zero to
twelve) proposed by Edgar F. Codd, a pioneer of the relational model
for databases.
They are designed to define what is required from a database
management system in order for it to be considered relational, i.e.,
a relational database management system (RDBMS).
0. Foundation Rule
• A relational database management system must manage its stored
data using only its relational capabilities.
1. Information Rule
• All information in the database should be represented in one and
only one way: as values in a table.
2. Guaranteed Access Rule
• Each and every datum (atomic value) is guaranteed to be logically
accessible by resorting to a combination of table name, primary key
value and column name.
3. Systematic Treatment of Null Values
• Null values (distinct from the empty character string or a string of
blank characters, and distinct from zero or any other number) are
supported in a fully relational DBMS for representing missing
information in a systematic way, independent of data type.
4. Dynamic On-line Catalog Based on the Relational Model
• The database description is represented at the logical level in the
same way as ordinary data, so authorized users can apply the same
relational language to its interrogation as they apply to regular data.
5. Comprehensive Data Sublanguage Rule
• A relational system may support several languages and various
modes of terminal use.
• However, there must be at least one language whose statements are
expressible, per some well-defined syntax, as character strings and
whose ability to support all of the following is comprehensive:
  - view definition
  - data definition
  - data manipulation (interactive and by program)
  - integrity constraints
  - authorization
  - transaction boundaries (begin, commit, and rollback).
6. View Updating Rule
• All views that are theoretically updateable are also updateable by
the system.
7. High-level Insert, Update, and Delete
• The capability of handling a base relation or a derived relation as a single operand
applies not only to the retrieval of data but also to the insertion, update, and
deletion of data.
8. Physical Data Independence
• Application programs and terminal activities remain logically unimpaired
whenever any changes are made in either storage representation or access
methods.
9. Logical Data Independence
• Application programs and terminal activities remain logically unimpaired when
information-preserving changes of any kind that theoretically permit unimpairment
are made to the base tables.
10. Integrity Independence
• Integrity constraints specific to a particular relational database
must be definable in the relational data sublanguage and storable
in the catalog, not in the application programs.
11. Distribution Independence
• The data manipulation sublanguage of a relational DBMS must
enable application programs and terminal activities to remain
logically unimpaired whether and whenever data are physically
centralized or distributed.
12. Nonsubversion Rule
• If a relational system has or supports a low-level
(single-record-at-a-time) language, that low-level language cannot
be used to subvert or bypass the integrity rules or constraints
expressed in the higher-level (multiple-records-at-a-time)
relational language.
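A few of the rules above can be observed directly in a small experiment with Python's built-in sqlite3 module. This is only an illustration of Rules 2, 3 and 4; it makes no claim that SQLite satisfies all thirteen rules. The customer table and its rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO customer VALUES (1, NULL), (2, ''), (3, '020-555')")

# Rule 2: one datum reached by table name + primary key value + column name.
phone = conn.execute(
    "SELECT phone FROM customer WHERE customer_id = ?", (3,)
).fetchone()[0]

# Rule 3: NULL (missing information) is distinct from the empty string.
missing = conn.execute(
    "SELECT customer_id FROM customer WHERE phone IS NULL").fetchall()
empty = conn.execute(
    "SELECT customer_id FROM customer WHERE phone = ''").fetchall()

# Rule 4: the catalog is itself a table, queried with the same language.
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```

Here sqlite_master is SQLite's own catalog table; other systems expose their catalog under different names.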
The Network Model

Data are represented by collections of records.
Relationships among data are represented by links.
Organization is that of an arbitrary graph.
The network model is a database model conceived as a
flexible way of representing objects and their relationships.
The figure shows a sample network database that is the equivalent
of the relational database.
[Figure: the sample database drawn twice, once labelled Relational Model and once labelled Network Model.]
The Hierarchical Model
Similar to the network model.
A hierarchical data model is a data model in which the data
is organized into a tree-like structure.
Organization of the records is as a collection of trees, rather
than arbitrary graphs.
The structure allows representing information using
parent/child relationships: each parent can have many children
but each child has only one parent (a 1:many relationship).
In a database, an entity type is the equivalent of a table,
each individual record is represented as a row, and
each attribute as a column.
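The parent/child rule can be sketched as a small Python class. The class name Record and the customer/account sample data are hypothetical; the only point is that each record holds exactly one parent link and any number of child links.

```python
class Record:
    """One record in a hierarchical database: one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # exactly one parent (None for the root)
        self.children = []        # zero or more children
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Follow the single parent link up to the root of the tree."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " > ".join(reversed(names))

root = Record("customer Jones")
a101 = Record("account A-101", parent=root)
a215 = Record("account A-215", parent=root)
```

Because a child has only one parent, an account shared by two customers would have to be stored twice, which is the main limitation compared with the network model's arbitrary graph.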
A sample hierarchical database
Data Independence
The ability to modify a schema definition in one level without
affecting a schema definition in a higher level is called data
independence.
There are two kinds:
Physical data independence
• The ability to modify the physical schema without causing
application programs to be rewritten.
• This has to do with altering the organization or
storage procedures related to the data, rather than modifying
the data itself.
Logical data independence
• The ability to modify the conceptual schema without causing
application programs to be rewritten. Such changes are usually
made when the logical structure of the database is altered.
• Logical data independence makes it possible to change the
structure of the data without modifying the
applications or programs that make use of the data.
• There is no need to rewrite current applications as part of the
process of adding to or removing data from the system.
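Physical data independence can be seen in a small experiment with Python's built-in sqlite3 module: the storage-level access path changes, the application query does not. The account table, its rows and the index name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500.0), ("A-215", 700.0), ("A-305", 350.0)],
)

# The "application program": this query text never changes.
QUERY = ("SELECT account_number FROM account "
         "WHERE balance > 400 ORDER BY account_number")

before = conn.execute(QUERY).fetchall()

# Physical change: add an access path (an index) at the storage level.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

after = conn.execute(QUERY).fetchall()
```

The index may change how the rows are found, but the query and its result stay the same, which is exactly what the definition requires.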
Database Manager
The database manager is a program module which provides the
interface between the low-level data stored in the database and
the application programs and queries submitted to the system.
The goal of the database system is to simplify and facilitate
access to data.
The database manager module is responsible for:-
• Interaction with the file manager:-
  - Storing raw data on disk using the file system usually provided
    by a conventional operating system.
  - The database manager must translate DML statements into
    low-level file system commands (for storing, retrieving and
    updating data in the database).
• Integrity enforcement:-
  - Checking that updates in the database do not violate consistency
    constraints (e.g. no bank account balance below $25).
• Security enforcement:-
  - Ensuring that users only have access to information they are
    permitted to see.
• Backup and recovery:-
  - Detecting failures due to power failure, disk crash, software
    errors, etc., and restoring the database to its state before the
    failure.
• Concurrency control:-
  - Preserving data consistency when there are concurrent users.
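Integrity enforcement, using the slide's own example of a minimum balance of $25, can be sketched with Python's built-in sqlite3 module. The account table, the withdraw helper and the amounts are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_number TEXT PRIMARY KEY,
        balance        REAL NOT NULL CHECK (balance >= 25)
    )
""")
conn.execute("INSERT INTO account VALUES ('A-101', 100)")
conn.commit()

def withdraw(account_number, amount):
    """Apply the update; the DBMS, not the application, refuses bad states."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? "
                "WHERE account_number = ?",
                (amount, account_number),
            )
        return True
    except sqlite3.IntegrityError:
        return False

first = withdraw("A-101", 50)    # leaves 50, allowed
second = withdraw("A-101", 40)   # would leave 10, below the minimum
balance = conn.execute("SELECT balance FROM account").fetchone()[0]
```

The constraint lives in the database, so every program that updates the table is checked in the same way.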
Database Administrator
The database administrator is a person having central control
over data and programs accessing that data.
Duties of the database administrator include:-
Schema definition:-
The creation of the original database schema.
This involves writing a set of definitions in a DDL, compiled
by the DDL compiler into a set of tables stored in the data
dictionary.
Storage structure and access method definition:-
Writing a set of definitions translated by the data storage and
definition language compiler.
Schema and physical organization modification:-
Writing a set of definitions used by the DDL compiler to
generate modifications to appropriate internal system tables
(e.g. data dictionary).
This is done rarely, but sometimes the database schema or
physical organization must be modified.
Granting of authorization for data access:-
Granting different types of authorization for data access to
various users.
Integrity constraint specification:-
Generating integrity constraints.
These are consulted by the database manager module
whenever updates occur.
Database Users
The database users fall into several categories:-
• Application programmers:-
  - Are computer professionals interacting with the system
    through DML calls embedded in a program written in a host
    language (e.g. C, Pascal).
• Sophisticated users:-
  - Interact with the system without writing programs.
  - They form requests by writing queries in a database query
    language.
  - These are submitted to a query processor that breaks a DML
    statement down into instructions for the database manager
    module.
• Specialized users:-
  - Are sophisticated users writing special database application
    programs.
  - These may be CAD systems, knowledge-based and expert
    systems, complex data systems (audio/video), etc.
• Naive users:-
  - Are unsophisticated users who interact with the system by
    using permanent application programs (e.g. automated teller
    machine).
Overall System Structure
Database systems are partitioned into modules for different
functions.
Some functions (e.g. file systems) may be provided by the
operating system.
Components include:-
File manager :-
Manages allocation of disk space and data structures used to
represent information on disk.
Database manager :-
The interface between low-level data and application
programs and queries.
Query processor :-
Translates statements in a query language into low-level
instructions the database manager understands.
DML precompiler :-
Converts DML statements embedded in an application
program to normal procedure calls in a host language.
The precompiler interacts with the query processor.
DDL compiler :-
Converts DDL statements to a set of tables containing
metadata stored in a data dictionary.
In addition, several data structures are required for physical
system implementation :-
Data files :-
Store the database itself.
Data dictionary :-
Stores information about the structure of the database. It is
used heavily.
Great emphasis should be placed on developing a good design
and efficient implementation of the dictionary.
Indices :-
Provide fast access to data items holding particular values.
Overall System Structure
Schema Diagram
A database schema, along with primary key and foreign key
dependencies, can be depicted pictorially by a schema diagram.
In a schema diagram each relation appears as a box, with its
attributes listed inside it and the relation name above it.
Foreign key dependencies appear as arrows from the foreign
key attributes of the referencing relation to the primary key of
the referenced relation.
Schema Diagram for the Banking Enterprise
[Figure: schema diagram for the banking enterprise, annotated with the labels Relation name, Primary key, Foreign key, Relation and Attribute.]
Schema Diagram for University Database
Integrity
Integrity constraints are used to ensure accuracy and
consistency of data in a relational database.
Data integrity is handled in a relational database through
integrity constraints. There are several types:
Entity integrity :-
The entity integrity constraint states that no primary key value
can be null.
This is because the primary key value is used to identify
individual tuples in a relation.
Having a null value for the primary key implies that we cannot
identify some tuples.
It also specifies that there may not be any duplicate entries
in the primary key column.
 Referential Integrity :-
The referential integrity constraint is specified between two relations
and is used to maintain the consistency among tuples in the two
relations.
The referential integrity constraint states that a tuple in one relation
that refers to another relation must refer to an existing tuple in that
relation.
Referential integrity is concerned with the relationships between the
tables of a database, i.e. that the data of one table does not
contradict the data of another table.
Specifically, every non-null foreign key value in a table must have a
matching primary key value in the related table.
This is the most common type of integrity constraint.
It is used to manage the relationships between primary and foreign
keys.
Domain Integrity :-
A domain is defined as the set of all unique values permitted for
an attribute.
For example, a domain of Date is the set of all possible valid
dates.
A domain of Integer is all possible whole numbers.
A domain of day-of-week is Monday, Tuesday ... Sunday.
Domain integrity states that every element of a relation
should respect the type and restrictions of its corresponding
attribute.
Restrictions could be the range of values that the element can
have, the default value if none is provided, and whether the element
can be NULL.
Column Constraints :-
During the data analysis phase, business rules will identify any
column constraints.
For example, a salary cannot be negative, and
an employee number must be in the range 1000 - 2000.
User-Defined Integrity Constraints :-
Business rules may dictate that when a specific action occurs,
further actions should be triggered.
For example, deletion of a record automatically writes that record
to an audit table.
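The kinds of integrity constraint above can be tried out together in one sketch with Python's built-in sqlite3 module. The department, employee and employee_audit tables are assumptions; the salary and employee number rules and the audit-on-delete rule come from the examples above. SQLite is loosely typed, so value restrictions are written here as CHECK constraints.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_no  INTEGER PRIMARY KEY                      -- entity integrity
                CHECK (emp_no BETWEEN 1000 AND 2000),    -- column constraint
        salary  REAL NOT NULL CHECK (salary >= 0),       -- column constraint
        dept_id INTEGER REFERENCES department(dept_id)   -- referential integrity
    );
    CREATE TABLE employee_audit (emp_no INTEGER, salary REAL, dept_id INTEGER);

    -- User-defined rule: every deleted employee row is copied to the audit table.
    CREATE TRIGGER employee_delete_audit AFTER DELETE ON employee
    BEGIN
        INSERT INTO employee_audit VALUES (OLD.emp_no, OLD.salary, OLD.dept_id);
    END;

    INSERT INTO department VALUES (10, 'IT');
    INSERT INTO employee   VALUES (1001, 30000, 10);
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement as an integrity violation."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

results = {
    "duplicate primary key": rejected("INSERT INTO employee VALUES (1001, 100, 10)"),
    "unknown department":    rejected("INSERT INTO employee VALUES (1002, 100, 99)"),
    "negative salary":       rejected("INSERT INTO employee VALUES (1003, -5, 10)"),
    "emp_no out of range":   rejected("INSERT INTO employee VALUES (5000, 100, 10)"),
}

conn.execute("DELETE FROM employee WHERE emp_no = 1001")
audit = conn.execute("SELECT * FROM employee_audit").fetchall()
```

Each entry in results is expected to be True, and the deleted row should appear in the audit table without the application writing it there.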
Converting E-R diagram into tables
University Questions Based on Unit-I
1) Compare various data models. [10 marks]
2) Explain in detail the different levels of data abstraction. [4 marks]
3) Compare DBMS and file processing system on the following points. [4 marks]
i) Redundancy
ii) Access control
4) What is the difference between specialization and generalization? Why do we not display this
difference in a schema diagram? [6 marks]
5) Specify Codd's norms to be satisfied by an RDBMS. [6 marks]
6) What are the enhancements that distinguish the EER model from the ER model? Explain with
an example. [6 marks]
