0% found this document useful (0 votes)
12 views90 pages

Unit 1

The document provides an introduction to Database Management Systems (DBMS), outlining their purpose, advantages, and disadvantages. It discusses the structure of DBMS, including levels of abstraction, components, and types of users, as well as various data models such as hierarchical, network, and relational models. The document emphasizes the importance of DBMS in managing data efficiently and addressing issues like redundancy, inconsistency, and security.

Uploaded by

moumitashopping0
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
12 views90 pages

Unit 1

The document provides an introduction to Database Management Systems (DBMS), outlining their purpose, advantages, and disadvantages. It discusses the structure of DBMS, including levels of abstraction, components, and types of users, as well as various data models such as hierarchical, network, and relational models. The document emphasizes the importance of DBMS in managing data efficiently and addressing issues like redundancy, inconsistency, and security.

Uploaded by

moumitashopping0
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 90

Unit 1: Introduction to DBMS

By :
Mrs. Suman Madan
[email protected]
Database Management System (DBMS)
 Collection of interrelated data

 Set of programs to access the data

 DBMS contains information about a particular enterprise

 DBMS provides an environment that is both convenient and efficient to


use.

 Database Applications:
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions

 Databases touch all aspects of our lives


Purpose of Database System
In the early days, database applications were
built on top of file systems.
Drawbacks of using file systems to store data:
 Data redundancy and inconsistency - duplication of
information in different files.
• Uncontrolled duplication of data is undesirable for
following reasons:
 Duplication costs time and money to enter data more than
once.
 It takes additional storage space thus again increasing
associated costs. It can be avoided by sharing data files.
 It may lead to data inconsistency.
 Difficulty in accessing data
 Need to write a new program to carry out each new
task.
Drawbacks of file systems (cont.)
• Data isolation — multiple files and formats.
• When data is isolated in separate files, it is more difficult to
access data and to ensure that data is correct.
• Also, the structure of file depends on application
programming language. Thus the direct incompatibility of
such files makes it difficult to process jointly.

• Integrity problems
 Integrity constraints (e.g. account balance > 0) become part
of program code
 Hard to add new constraints or change existing ones.
Drawbacks of file systems (cont.)
• Atomicity of updates
 Failures may leave database in an inconsistent state with
partial updates carried out.
 E.g. transfer of funds from one account to another should
either complete or not happen at all.

• Concurrent access by multiple users


 Concurrent accessed needed for performance.
 Uncontrolled concurrent accesses can lead to inconsistencies
 E.g. two people reading a balance and updating it at the same
time.
• Security problems
Database systems offer solutions to all the above
problems
Levels of Abstraction
 Physical level/Internal level : It describes how a
record (e.g., customer) is stored.
 Logical level/Conceptual level: It describes data
stored in database, and the relationships among
the data.
type customer = record
name : string;
street : string;
city : integer;
end;
 View level/ External Level: Application programs
hide details of data types. Views can also hide
information (e.g., salary) for security purposes.
Views of Data

External
view: User &
Data
Designer

Conceptual
Schema:
Data
Designer

Physical
Storage:
DBA
An architecture for a database system
Views of Data
 Physical level/Internal level : The physical
representation of the database on the computer.
This level describes how the data is stored in the
database.
• It includes :
 Where the data is located
 File structures
 Access methods
 Indexes.

The physical schema is managed by the DBA.


Views of Data
 Logical level/Conceptual level: The community view of
the database. This level describes what data is stored
in the database and the relationships among the data.
• What are the entities and Relationships in
organization.
• What information these entities and relationships
should store in database.
• What integrity constraints/business rules it should
have?
• It consists of the schemas we have described with
CREATE TABLE statements.
Views of Data
 View level/ External Level: The users view of the
database. This level describes that part of the
database that is relevant to each user.
• Each external schema is a combination of base
tables and views, tailored to the needs of a single
user.
• It is managed by the data designer and the user.
Advantages of DBMS

 Control of data redundancy


 Data consistency
 More information from the same amount of data
 Sharing of data
 Improved data integrity
 Improved security
 Enforcement of standards
 Economy of scale
 Balance of conflicting requirements
 Improved data accessibility and responsiveness
 Increased productivity
 Improved maintenance through data independence
Disadvantages of DBMS
 Complexity – provision of the functionality we expect from
DBMS makes it extremely complex.
 Size – complexity and breadth of functionality makes DBMS
extremely large piece of software.
 Cost of DBMSs – it varies significantly depending on the
environment & functionality provided.
 Additional hardware costs – to achieve required
performance, it is necessary to procure large memory.
 Performance – DBMS is written to be more general, to cater
for many applications rather than just one.
 Higher impact of failure – Centralization of resources
increases the vulnerability of the system. Since all users &
applications rely on DBMS, the failure of one component
can bring operations to a halt.
Components of the DBMS
1. Hardware – DBMS and the applications require hardware to
run.
• It can range from PC to mainframe or network of
computers.
• It depends on the organization’s requirements and the
DBMS used.
2. Software – It comprises of following :
• DBMS software itself
• Application program
• Operating System including network software
3. Data - Most important component from end-user’s point
of view.
• It acts as a bridge between the machine components
and the human component.
• The database contains both operational data and the
metadata.
Components of the DBMS
4. Procedures – It refers to the instructions and rules that
govern the design and use of the database.
• Users of the system require documented procedures
on how to use/run the system.
• It may consists of instructions like
• Log on to the DBMS
• Use a particular DBMS facility or application
program.
• Start and stop DBMS
• Make backup copies of the database
• Handle H/W and S/W failures.
• Change structure of table to improve performance

5. People – i.e USERS


Database Administrator
 Coordinates all the activities of the database system;
the database administrator has a good understanding
of the enterprise’s information resources and needs:
 Database administrator’s duties include:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting user authority to access the database
• Specifying integrity constraints
• Acting as liaison with users
• Monitoring performance and responding to changes in
requirements
Database Users
 Users are differentiated by the way they expect to interact with the
system.

1. Application programmers: They are the developers who interact


with the database by means of DML queries. These DML queries
are written in the application programs like C, C++, JAVA, Pascal
etc. These queries are converted into object code to communicate
with the database.

For example, writing a C program to generate the report of


employees who are working in particular department will involve a
query to fetch the data from database. It will include a embedded SQL
query in the C Program.
Database Users
2. Sophisticated users : They are database developers, who write SQL
queries to select/insert/delete/update data. They do not use any
application or programs to request the database. They directly
interact with the database by means of query language like SQL.

These users will be scientists, engineers, analysts who thoroughly


study SQL and DBMS to apply the concepts in their requirement. In
short, we can say this category includes designers and developers of
DBMS and SQL.

3. Specialized users: These are also sophisticated users, but they write
special database application programs. They are the developers who
develop the complex programs to the requirement.
Database Users
4. Stand-alone Users - These users will have stand –alone
database for their personal use. These kinds of database will
have readymade database packages which will have menus and
graphical interfaces.

5. Naive users: These are the users who use the existing
application to interact with the database. For example, online
library system, ticket booking systems, ATMs etc which has
existing application and users use them to interact with the
database to fulfill their requests.
Overall System Structure

indices Statistical data disk storage

Data files Data dictionary


Data Models
 An integrated collection of concepts for describing and manipulating the
following in an organization:
• Data
• Data relationships
• Data semantics
• Data constraints

 Facilitate interaction among the designer, the applications programmer, and the
end user

 A data model can be thought of as comprising 3 components:

• Structural Part – consisting of set of rules according to which databases can


be structured.

• Manipulative part – defining the types of operations that are allowed on the
data (includes insertion, retrieval or updating)

• Set of integrity rules – for ensuring that data is accurate.


Data Model Categories

 Object-based data models


◦ Entity-relationship model
◦ Object-oriented model
◦ Semantic model
◦ Functional model

 Record-based logical models


◦ Relational model (e.g., SQL/DS, DB2)
◦ Network model
◦ Hierarchical model (e.g., IMS)

 Physical Data Models


◦ Unifying model
◦ Frame Memory
Hierarchical Model
 A hierarchical data model is a data model in which the data is organized
into a tree-like structure.

 The structure allows repeating information using parent/child relationships:


each parent can have many children but each child only has one parent.

 All attributes of a specific record are listed under an entity type.


Hierarchical Model

• ADVANTAGES:

1. Simplicity : Data naturally have hierarchical relationship in most of the practical situations.
Therefore, it is easier to view data arranged in manner. This makes this type of database more
suitable for the purpose.

2. Security :These database system can enforce varying degree of security feature unlike flat-file
system.

3. Database Integrity: Because of its inherent parent-child structure, database integrity is highly
promoted in these systems.

4. Efficiency: The hierarchical database model is a very efficient, one when the database contains a large
number of I: N relationships (one-to-many relationships) and when the users require large number of
transactions, using data whose relationships are fixed.
• DISADVANTAGES:
1. Complexity of Implementation: The actual implementation of a hierarchical database depends on the
physical storage of data. This makes the implementation complicated.
2. Difficulty in Management: The movement of a data segment from one location to another cause all the
accessing programs to be modified making database management a complex affair.
3. Complexity of Programming: Programming a hierarchical database is relatively complex because the
programmers must know the physical path of the data items.
4. Poor Portability: The database is not easily portable mainly because there is little or no standard existing for
these types of database.
5. Database Management Problems: If you make any changes in the database structure of a hierarchical
database, then you need to make the necessary changes in all the application programs that access the
database. Thus, maintaining the database and the applications can become very difficult.
6. Lack of structural independence: Structural independence exists when the changes to the database structure
does not affect the DBMS's ability to access data. Hierarchical database systems use physical storage paths to
navigate to the different data segments. So, the application programs should have a good knowledge of the
relevant access paths to access the data. So, if the physical structure is changed the applications will also have
to be modified. Thus, in a hierarchical database the benefits of data independence are limited by structural
dependence.
7. Programs Complexity: Due to the structural dependence and the navigational structure, the application
programs and the end users must know precisely how the data is distributed physically in the database in
order to access data. This requires knowledge of complex pointer systems, which is often beyond the grasp of
ordinary users (users who have little or no programming knowledge).
8. Operational Anomalies: Hierarchical model suffers from the Insert anomalies, Update anomalies and
Deletion anomalies, also the retrieval operation is complex and asymmetric, thus hierarchical model is not
suitable for all the cases.
9. Implementation Limitation: Many of the common relationships do not conform to the l:N format required by
the hierarchical model. The many-to-many (N:N) relationships, which are more common in real life are very
difficult to implement in a hierarchical model.
Network Model
 The network model is a database model conceived as a flexible way of representing objects and their
relationships.

 Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and
relationship types are arcs, is not restricted to being a hierarchy or lattice.

 The network model replaces the hierarchical model with a graph thus allowing more general
connections among the nodes. The main difference of the network model from the hierarchical model
is its ability to handle many to many relationships. In other words it allow a record to have more than
one parent.
Network Model
• ADVANTAGES:
1.) Conceptual simplicity-Just like the hierarchical model,the network model is also
conceptually simple and easy to design.
2.) Capability to handle more relationship types-The network model can handle the one to
many and many to many relationships which is real help in modeling the real life situations.
3.) Ease of data access-The data access is easier and flexible than the hierarchical model.
4.) Data integrity- The network model does not allow a member to exist without an owner.
5.) Data independence- The network model is better than the hierarchical model in isolating the
programs from the complex physical storage details.
6.) Database standards

• DISADVANTAGES:
1.) System complexity- All the records are maintained using pointers and hence the whole
database structure becomes very complex.
2.) Operational Anomalies- The insertion, deletion and updating operations of any record
require large number of pointers adjustments.
3.) Absence of structural independence-structural changes to the database is very difficult.
Relational Model
Attributes
 Example of tabular data in the relational model
Customer- customer- customer- customer- account-
id name street city number
192-83-7465 Johnson Alma Palo Alto A-101
019-28-3746 Smith North Rye A-215
192-83-7465 Johnson Alma Palo Alto A-201
321-12-3123 Jones Main Harrison A-217
019-28-3746 Smith North Rye A-201
A Sample Relational Database
Schemas and Instances
 Instances and schemas are similar to types and variables in
programming languages.

 Schema – the logical structure of the database. This means overall


description of the database.
• e.g., the database consists of information about a set of customers and
accounts and the relationship between them.

• Analogous to type information of a variable in a program.

• Physical schema: database design at the physical level.

• Logical schema: database design at the logical level.

• The schema is specified during the database design process and is not
expected to change frequently.
Schemas and Instances
 Instance – the actual content of the database i.e. actual data at a particular
point in time .
• Analogous to the value of a variable.
• The actual data in the database may change very frequently.

 Thus many database instances can correspond to the same database


schema.

 The schema is sometimes called as intension of the database.

 The instance is sometimes called as extension or state of the database.


Three-Schema Architecture
• Defines DBMS schemas at three levels:

• Internal schema at the internal level to describe physical


storage structures and access paths. Typically uses a
physical data model.

• Conceptual schema at the conceptual level to describe the


structure and constraints for the whole database for a
community of users. Uses a conceptual or an
implementation data model.

• External schemas at the external level to describe the


various user views. Usually uses the same data model as the
conceptual level.
Three-Schema Architecture
 Mappings among schema levels are needed to
transform requests and data. Programs refer to
an external schema, and are mapped by the
DBMS to the internal schema for execution.

 Mappings between internal and conceptual


schemas is known I/C mapping.

 Mappings between external and conceptual


schemas is known E/C mapping.
Data Independence
 The disjointing of data descriptions from the application programs (or user-
interfaces) that uses the data is called data independence.

 Data independence is one of the main advantages of DBMS.

 The three-schema architecture provides the concept of data independence,


which means that upper-levels are unaffected by changes to lower-levels.

 The interfaces between the various levels and components should be well
defined so that changes in some parts do not seriously influence others.

 There are two kinds of data independence.

• Physical Data Independence


• Logical Data Independence
Types of Data Independence
 Physical Data Independence – the ability to modify the physical
schema without changing the logical schema

• Applications depend on the logical schema


• In general, the interfaces between the various levels and
components should be well defined so that changes in some parts
do not seriously influence others.

 Logical Data Independence – the ability to modify the conceptual


schema without changing the application program

• Usually done when logical structure of database is altered.


• Logical data independence is harder to achieve as the application
programs are usually heavily dependent on the logical structure of
the data. An analogy is made to abstract data types in
programming languages.
Entity-Relationship Model
 An entity-relationship (ER) diagram is a specialized
graphic that illustrates the interrelationships
between entities in a database.
 ER diagrams often use symbols to represent three
different types of information, namely, entities,
attributes and relationships.
How do we start ERD
 Define Entities: These are usually nouns used in
descriptions of the system, in the discussion of business
rules, or in documentation; identified in the narrative.

 Define Relationships: These are usually verbs used in


descriptions of the system or in discussion of the
business rules (entity ______ entity); identified in the
narrative.

 Add attributes to the relations: These are determined by


the queries, and may also suggest new entities, e.g.
grade; or they may suggest the need for keys or
identifiers.
ER Diagram Symbols

 Entity
 An entity is an object or concept about which you want to store
information.

 Key attribute
 A key attribute is the unique, distinguishing characteristic of the
entity. For example, an employee's social security number might
be the employee's key attribute.
ER Diagram Symbols

 Relationships
 Relationships illustrate how two entities share information in the
database structure.

 Weak Entity
 A weak entity is an entity that must defined by a foreign key
relationship with another entity as it cannot be uniquely identified
by its own attributes alone.
ER Diagram Symbols

 Derived attribute
 A derived attribute is based on another attribute. For example, an
employee's monthly salary is based on the employee's annual
salary.

 Multivalued attribute
 A multivalued attribute can have more than one value. For
example, an employee entity can have multiple skill values.
How to express Relationships
 Cardinality specifies how many instances of an entity relate to one instance
of another entity.

 Ordinality is also closely linked to cardinality.

 While cardinality specifies the occurrences of a relationship, ordinality


describes the relationship as either mandatory or optional.

 In other words, cardinality specifies the maximum number of relationships


and ordinality specifies the absolute minimum number of relationships.

 When the minimum number is zero, the relationship is usually called


optional and when the minimum number is one or more, the relationship is
usually called mandatory.
Relationship Symbols

a
Entity-Relationship Model

Example of schema in the entity-


relationship model
Entity Relationship Model (Cont.)

 E-R model of real world


◦ Entities (objects)
 E.g. customers, accounts, bank branch
◦ Relationships between entities
 E.g. Account A-101 is held by customer Johnson
 Relationship set depositor associates customers with
accounts
 Widely used for database design
◦ Database design in E-R model usually converted to
design in the relational model (coming up next) which is
used for storage and processing
Data Definition Language (DDL)
 Specification notation for defining the database
schema
◦ E.g.
create table account (
account-number char(10),
balance integer)
 DDL compiler generates a set of tables stored in
a data dictionary
 Data dictionary contains metadata (i.e., data
about data)
◦ database schema
◦ Data storage and definition language
 language in which the storage structure and access
methods used by the database system are specified
 Usually an extension of the data definition language
Data Manipulation Language (DML)

 Language for accessing and manipulating the data


organized by the appropriate data model
◦ DML also known as query language
 Two classes of languages
◦ Procedural – user specifies what data is required and how
to get those data
◦ Nonprocedural – user specifies what data is required
without specifying how to get those data
 SQL is the most widely used query language
Entity Sets
 A database can be modeled as:
◦ a collection of entities,
◦ relationship among entities.

 An entity is an object that exists and is


distinguishable from other objects.
◦ Example: specific person, company, event, plant

 Entities have attributes


◦ Example: people have names and addresses

 An entity set is a set of entities of the same type


that share the same properties.
◦ Example: set of all persons, companies, trees, holidays
Entity Sets ‘customer ‘ and ‘loan’
customer-id customer- customer- customer- loan- amount
name street city number
Attributes
 An entity is represented by a set of attributes,
that is descriptive properties possessed by all
members of an entity set.
Example:
customer = (customer-id, customer-name,
customer-street, customer-city)
loan = (loan-number, amount)
Domain – the set of permitted values for each
attribute
 Attribute types:
◦ Simple and composite attributes.
◦ Single-valued and multi-valued attributes
 E.g. multivalued attribute: phone-numbers
◦ Derived attributes
 Can be computed from other attributes
 E.g. age, given date of birth
Composite Attributes
Degree of a Relationship Set
 Refers to number of entity sets that participate in
a relationship set.
 Relationship sets that involve two entity sets are
binary (or degree two). Generally, most
relationship sets in a database system are binary.

 Relationship sets may involve more than two


entity sets.
H E.g. Suppose employees of a bank may have jobs (responsibilities) at
multiple branches, with different jobs at different branches. Then there
is a ternary relationship set between entity sets employee, job and
branch

 Relationships between more than two entity sets


are rare. Most relationships are binary.
Mapping Cardinalities
 Express the number of entities to which another
entity can be associated via a relationship set.

 Most useful in describing binary relationship sets.

 For a binary relationship set the mapping


cardinality must be one of the following types:
◦ One to one
◦ One to many
◦ Many to one
◦ Many to many
Mapping Cardinalities

One to one One to many


Note: Some elements in A and B may not be mapped to any
elements in the other set
Mapping Cardinalities

Many to one Many to many


Note: Some elements in A and B may not be mapped to any
elements in the other set
Mapping Cardinalities affect ER Design
n Can make access-date an attribute of account, instead of a
relationship attribute, if each account can have only one customer
n I.e., the relationship from account to customer is many to one,
or equivalently, customer to account is one to many
E-R Diagrams

n Rectangles represent entity sets.


n Diamonds represent relationship sets.
n Lines link attributes to entity sets and entity sets to relationship sets.
n Ellipses represent attributes
n Double ellipses represent multivalued attributes.
n Dashed ellipses denote derived attributes.
n Underline indicates primary key attributes (will study later)
E-R Diagram With Composite, Multivalued,
and Derived Attributes
Relationship Sets with Attributes
Cardinality Constraints
 We express cardinality constraints by drawing
either a directed line (), signifying “one,” or an
undirected line (—), signifying “many,” between
the relationship set and the entity set.
 E.g.: One-to-one relationship:
◦ A customer is associated with at most one loan via the
relationship borrower
◦ A loan is associated with at most one customer via
borrower
One-To-Many Relationship
 In the one-to-many relationship a loan is
associated with at most one customer via
borrower, a customer is associated with several
(including 0) loans via borrower
Many-To-One Relationships
 In a many-to-one relationship a loan is
associated with several customers via borrower,
a customer is associated with at most one loan
via borrower
Many-To-Many Relationship

 A customer is associated with several (possibly


0) loans via borrower
 A loan is associated with several customers via
borrower
Alternative Notation for
Cardinality Limits
n Cardinality limits can also express participation constraints
Keys
 A super key of an entity set is a set of one or more
attributes whose values uniquely determine each
entity.

 A candidate key of an entity set is a minimal super


key
◦ Customer-id is candidate key of customer
◦ account-number is candidate key of account

 Although several candidate keys may exist, one of


the candidate keys is selected to be the primary
key.
Design Issues
 Use of entity sets vs. attributes
 Choice mainly depends on the structure of the enterprise being
modeled, and on the semantics associated with the attribute in
question.
 Use of entity sets vs. relationship sets
 Possible guideline is to designate a relationship set to describe
an action that occurs between entities
 Binary versus n-ary relationship sets
 Although it is possible to replace any nonbinary (n-ary, for n >
2) relationship set by a number of distinct binary relationship
sets, a n-ary relationship set shows more clearly that several
entities participate in a single relationship.
 Placement of relationship attributes
Weak Entity Sets

 An entity set that does not have a primary key is


referred to as a weak entity set.

 The existence of a weak entity set depends on the


existence of a identifying entity set
◦ it must relate to the identifying entity set via a total, one-
to-many relationship set from the identifying to the weak
entity set
◦ Identifying relationship depicted using a double diamond
Weak Entity Sets

 The discriminator (or partial key) of a weak entity


set is the set of attributes that distinguishes
among all the entities of a weak entity set.

 The primary key of a weak entity set is formed by


the primary key of the strong entity set on which
the weak entity set is existence dependent, plus
the weak entity set’s discriminator.
Weak Entity Sets
 We depict a weak entity set by double rectangles.
 We underline the discriminator of a weak entity set
with a dashed line.
 Primary key for payment – (loan-number, payment-
number)
Weak Entity Sets (Cont..)
 Note: the primary key of the strong entity set is
not explicitly stored with the weak entity set,
since it is implicit in the identifying relationship.

 If loan-number were explicitly stored, payment


could be made a strong entity, but then the
relationship between payment and loan would
be duplicated by an implicit relationship defined
by the attribute loan-number common to
payment and loan
More Weak Entity Set Examples
 In a university, a course is a strong entity and a
course-offering can be modeled as a weak entity
 The discriminator of course-offering would be
semester (including year) and section-number (if
there is more than one section)
 If we model course-offering as a strong entity we
would model course-number as an attribute.
Then the relationship with course would be implicit
in the course-number attribute
Specialization
 Top-down design process; we designate sub groupings
within an entity set that are distinctive from other entities
in the set.

 These sub groupings become lower-level entity sets that


have attributes or participate in relationships that do not
apply to the higher-level entity set.

 Depicted by a triangle component labeled IS A (E.g.


customer “is a” person).

 Attribute inheritance – a lower-level entity set inherits all


the attributes and relationship participation of the higher-
level entity set to which it is linked.
Specialization Example
Generalization
 A bottom-up design process – combine a
number of entity sets that share the same
features into a higher-level entity set.

 Specialization and generalization are simple


inversions of each other; they are represented in
an E-R diagram in the same way.

 The terms specialization and generalization are


used interchangeably.
Generalization (cont.)
NoOfPassengers LicensePlateNo NoOfAxies LicensePlateNo

MaxSpeed Tonnage
CAR Price TRUCK Price
VehicleID VehicleID

VehicleID Price LicensePlateNo

VEHICLE

NoOfPassengers d NoOfAxies

MaxSpeed
CAR TRUCK Tonnage
Generalization (Cont..)

 Generalization suppresses the difference among


several entity types, identifying their common
features, and generalize them into a single
superclass of which the original types are special
subclasses.

 The decision as to which process, generalization


or specialization, is more appropriate in a
particular situation is often subjective.
Specialization and Generalization
 Can have multiple specializations of an entity set based on
different features.

 E.g. permanent-employee vs. temporary-employee, in


addition to officer Vs secretary Vs teller

 Each particular employee would be


◦ a member of one of permanent-employee or temporary-
employee,
◦ and also a member of one of officer, secretary, or teller

 The IS A relationship also referred to as superclass - subclass


relationship
Design Constraints on a
Specialization/Generalization
 Constraint on which entities can be members of a
given lower-level entity set.
◦ condition-defined
 E.g. all customers over 65 years are members of senior-citizen
entity set; senior-citizen ISA person.
◦ user-defined
 Constraint on whether or not entities may belong to
more than one lower-level entity set within a single
generalization.
◦ Disjoint
 an entity can belong to only one lower-level entity set
 Noted in E-R diagram by writing disjoint next to the ISA
triangle
◦ Overlapping
 an entity can belong to more than one lower-level entity set
E-R Design Decisions
 The use of an attribute or entity set to represent an
object.
 Whether a real-world concept is best expressed by an
entity set or a relationship set.
 The use of a ternary relationship versus a pair of binary
relationships.
 The use of a strong or weak entity set.
 The use of specialization/generalization – contributes to
modularity in the design.
 The use of aggregation – can treat the aggregate entity
set as a single unit without concern for the details of its
internal structure.
E-R Diagram for a Banking Enterprise
Summary of Symbols Used in E-R
Notation
Reduction of an E-R Schema to Tables
 Primary keys allow entity sets and relationship sets to be
expressed uniformly as tables which represent the contents
of the database.
 A database which conforms to an E-R diagram can be
represented by a collection of tables.
 For each entity set and relationship set there is a unique table
which is assigned the name of the corresponding entity set or
relationship set.
 Each table has a number of columns (generally corresponding
to attributes), which have unique names.
 Converting an E-R diagram to a table format is the basis for
deriving a relational database design from an E-R diagram.
Representing Entity Sets as Tables
 A strong entity set reduces to a table with the
same attributes.
Composite Attributes

 Composite attributes are flattened out by creating


a separate attribute for each component attribute

◦ E.g. given entity set customer with composite attribute


name with component attributes first-name and last-
name the table corresponding to the entity set has two
attributes :
name.first-name and
name.last-name
Multivalued Attributes
 A multivalued attribute M of an entity E is
represented by a separate table EM

◦ Table EM has attributes corresponding to the primary key


of E and an attribute corresponding to multivalued
attribute M

◦ E.g. Multivalued attribute dependent-names of employee


is represented by a table
employee-dependent-names( employee-id, dname)

◦ Each value of the multivalued attribute maps to a


separate row of the table EM
 E.g., an employee entity with primary key Charles and
dependents Johnson and Jumi maps to two rows:
(John, Johnson) and (John, Jumi)
Representing Weak Entity Sets

n A weak entity set becomes a table that includes a


column for the primary key of the identifying strong
entity set
Representing Relationship Sets as

Tables
A many-to-many relationship set is represented as a table with
columns for the primary keys of the two participating entity sets, and
any descriptive attributes of the relationship set.
 E.g.: table for relationship set borrower
Redundancy of Tables
n Many-to-one and one-to-many relationship sets that are
total on the many-side can be represented by adding an
extra attribute to the many side, containing the primary
key of the one side

n E.g.: Instead of creating a table for relationship account-


branch, add an attribute branch to the entity set account
Redundancy of Tables (Cont.)
 For one-to-one relationship sets, either side can be chosen to
act as the “many” side
◦ That is, extra attribute can be added to either of the tables
corresponding to the two entity sets

 If participation is partial on the many side, replacing a table


by an extra attribute in the relation corresponding to the
“many” side could result in null values

 The table corresponding to a relationship set linking a weak


entity set to its identifying strong entity set is redundant.
◦ E.g. The payment table already contains the information that
would appear in the loan-payment table (i.e., the columns
loan-number and payment-number).
Representing Specialization as Tables
 Method 1:
◦ Form a table for the higher level entity
◦ Form a table for each lower level entity set, include primary key of
higher level entity set and local attributes

table table attributes


personname, street, city
customer name, credit-rating
employee name, salary

◦ Drawback: getting information about, e.g., employee requires


accessing two tables
Representing Specialization as Tables
Method 2: (Cont.)
◦ Form a table for each entity set with all local and inherited attributes

table table attributes


personname, street, city
customer name, street, city, credit-rating
employee name, street, city, salary

◦ If specialization is total, table for generalized entity (person) not


required to store information
 Can be defined as a “view” relation containing union of
specialization tables
 But explicit table may still be needed for foreign key constraints

◦ Drawback: street and city may be stored redundantly for persons


who are both customers and employees
End of Unit 1

Thanks…….

You might also like