Dbms Chapter 3

Uploaded by

gbrhailu
Copyright © All Rights Reserved
UNIT THREE

CONCEPTUAL DATABASE DESIGN AND ER-MODEL

DBMS I 1
Database Design
 Database design is the process of producing the different
kinds of specifications for the data to be stored in the
database.
 The ability to design databases and associated applications is
critical to the success of the modern enterprise.
 Database design requires understanding both the
operational and business requirements of an organization
as well as the ability to model and realize those
requirements using a database.
 Developing database and information systems is performed
using a development lifecycle, which consists of a series of
steps.
 As database design is one component of most information system
development tasks, there are several steps to follow in
designing a database system.
 The requirements gathering and specification provides you
with a high-level understanding of the organization, its
data, and the processes that you must model in the
database.
 Database design involves constructing a suitable model of
this information.
 Since the design process is complicated, especially for
large databases, database design is mainly focused on these
three phases:
• Conceptual Design
• Logical Design, and
• Physical Design
 In general, one has to go back and forth between these
tasks to refine a database design, and decisions in one task
can influence the choices in another task.
1. Conceptual Database Design
 represents the organizational and user data requirements
discovered and analyzed during requirement analysis.
 is the process of constructing a model of the
information used in an enterprise.
 is used as input or source of information for the logical
database design phase.
 it is a phase which is independent of all physical
considerations (DBMS, OS, . . . ).
 it is carried out using entity-relationship model
(conceptual level, conceptual schema).
 The conceptual design activities are:
1. Identify all entities and their relationships
 Questions that are addressed during conceptual
design:
 What are the entities and relationships of interest?
 What information about entities and relationships
among entities needs to be stored in the database?
 What are the constraints (or business rules) that (must)
hold for the entities and relationships?
 A database schema in the ER model can be represented
pictorially using an Entity-Relationship (ER) diagram.
REASONS FOR CONCEPTUAL MODELING

 Conceptual modeling has the following advantages:
 helps users and system developers to identify data
requirements (abstract model)
 helps in understanding how existing systems can be
modified/maintained
 allows for easy communication between end-users and
developers.
 independent of DBMS or any OS.
 has a clear method to convert from high-level model to
relational model.
 it is a permanent description of the database
requirements.
ER Modeling
 is an abstract and conceptual representation of data.
 is used to produce a type of conceptual schema of a system.
 used to interpret, specify & document requirements for DBs
irrespective of DBMS being used.
 The Sequence: Conceptual data model (i.e. ER) is, at a later
stage (called logical design), mapped to a logical data model,
(e.g. relational model); this is mapped to a physical model in
physical design.
 Diagrams created by this process are called ER diagrams.
 E-R diagrams provide a visual, graphical model of the
information content of a system.
 Developers create E-R diagrams that represent their
understanding of user requirements.
 Users then carefully evaluate the E-R diagrams to make sure that
their needs are being met.
 Once the E-R diagram has been approved by the user
community, the diagram provides the specification of what
must be accomplished by the developers.
 In the presence of accurate models, developers can be
confident that they are building useful systems.
 Without a precise description of the agreement between users
and developers, the system is doomed to failure.
 Our goal, then, is to produce a data model that is
understandable to users and that accurately and precisely
describes the structure of the information to be stored in the
database.
 The relational data model and the ERM combine to provide the
foundation for tightly structured database design.
 ER models are normally represented in an entity relationship
diagram (ERD), which uses graphical representations to
model database components.
COMPONENTS OF ER MODEL
 The ER model is based on the following components:
A) Entity
 real world physical or logical object with an independent
existence and which is distinguishable from other objects.
 is represented in the ERD by a rectangle, also known as an
entity box.
 The name of the entity, a noun, is written in the center of
the rectangle.
 Entity name is written in capital letters and is written in the
singular form:
 PAINTER rather than PAINTERS, and EMPLOYEE
rather than EMPLOYEES.
 Usually, when applying the ERD to the relational model, an
entity is mapped to a relational table.
 Each row in the relational table is known as an entity
instance or entity occurrence in the ER model.
 Each entity is described by a set of attributes that describes
particular characteristics of the entity.
 For example, the entity EMPLOYEE will have attributes
such as a Social Security number, a last name, and a first
name.
Examples of Entities
 Persons: agency, contractor, customer, department,
division, employee, instructor, student, supplier.
 Places: sales region, building, room, branch office, campus.
 Objects: book, machine, part, product, raw material,
software license, software package, tool, vehicle model,
vehicle.
 Events: application, award, cancellation, class, flight,
invoice, order, registration, renewal, requisition, reservation,
sale, trip.
B) Attributes
 are properties used to describe each Entity or real world object.
 are used to store pieces of information about entities.
 Attributes will give rise to recorded items of data in the
database
 For example, the STUDENT entity includes, among many
others, the attributes STU_LNAME, STU_FNAME, and
STU_INITIAL.
 In the original Chen notation, attributes are represented by ovals
and are connected to the entity rectangle with a line.

Figure 3.1 The attributes of the STUDENT entity: Chen Model


Types of Attribute
1. Required and Optional Attribute
 A required attribute is an attribute that must have a value;
in other words, it cannot be left empty.
 In Figure 3.1, two attributes, STU_LNAME and STU_FNAME,
require data entries because of the assumption that all
students have a last name and a first name.
 But students might not have a middle name, and perhaps
they do not (yet) have a phone number and an e-mail
address.
 Therefore, those attributes do not require data entries.
 An optional attribute is an attribute that does not require a
value; therefore, it can be left empty.
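The distinction between required and optional attributes can be sketched in a relational DBMS as NOT NULL versus nullable columns. The following is a minimal sketch using Python's built-in sqlite3 module; the table layout and the stu_id and stu_email columns are assumptions made for illustration, not part of the slides.

```python
import sqlite3

# Hypothetical STUDENT table; stu_id and stu_email are assumed columns.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        stu_id      INTEGER PRIMARY KEY,
        stu_lname   TEXT NOT NULL,   -- required attribute
        stu_fname   TEXT NOT NULL,   -- required attribute
        stu_initial TEXT,            -- optional attribute (may be NULL)
        stu_email   TEXT             -- optional attribute (may be NULL)
    )
""")

# Optional attributes can simply be left out.
conn.execute("INSERT INTO student (stu_lname, stu_fname) VALUES ('Smith', 'Anne')")

# Leaving out a required attribute is rejected.
try:
    conn.execute("INSERT INTO student (stu_fname) VALUES ('John')")
    required_enforced = False
except sqlite3.IntegrityError:
    required_enforced = True

row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Only the first insert succeeds, so the table ends up holding a single row.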
2. Identifiers (Primary Keys) Attribute
 Identifier attribute or Key is an attribute (or combination
of attributes) that uniquely identifies individual instances
of an entity type, such as Student_ID.
 Identifiers are underlined in the ERD.
 An identifier can be simple or composite.

(a) Simple key attribute


 Ideally, an entity identifier is composed of only a single
attribute.
 However, it is possible to use a composite identifier, that is,
a primary key composed of more than one attribute.
 A composite identifier is used when no single (atomic)
attribute can serve as an identifier.
 Flight_ID is a composite identifier that has component
attributes Flight_Number and Date – this combination is
required to uniquely identify individual occurrences of
Flight.
 Flight_ID is underlined, while its components are not.
 If the Flight_ID attribute is deleted from the FLIGHT entity, the
candidate key (Flight_Number, Date) becomes an
acceptable composite primary key.
 The following figure shows a primary key composed of
more than one attribute.
(b) Composite key attribute
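A composite identifier such as (Flight_Number, Date) can be sketched as a composite primary key. This is a minimal sketch with Python's sqlite3 module; the gate column and the sample values are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flight (
        flight_number TEXT NOT NULL,
        flight_date   TEXT NOT NULL,
        gate          TEXT,                      -- hypothetical non-key attribute
        PRIMARY KEY (flight_number, flight_date) -- composite identifier
    )
""")
conn.execute("INSERT INTO flight VALUES ('F100', '2024-01-10', 'A1')")
# Same flight number on another date is a different occurrence of FLIGHT.
conn.execute("INSERT INTO flight VALUES ('F100', '2024-01-11', 'B4')")

# Repeating the full (number, date) combination violates the identifier.
try:
    conn.execute("INSERT INTO flight VALUES ('F100', '2024-01-10', 'C2')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

flight_count = conn.execute("SELECT COUNT(*) FROM flight").fetchone()[0]
```

Neither attribute is unique on its own; only the combination identifies a flight occurrence.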
3. Composite and Simple Attributes
 A composite attribute is an attribute that can be
further subdivided to yield additional attributes.
 For example, the attribute ADDRESS can be
subdivided into street address, city, state, and postal
code.
 Similarly, the attribute PHONE_NUMBER can be
subdivided into area code and exchange number.
 A simple attribute is an attribute that cannot be
subdivided.
 For example, age, sex, and marital status would be
classified as simple attributes.
 To facilitate detailed queries, it is wise to change
composite attributes into a series of simple attributes.
Figure 3.2 A composite attribute (an attribute broken into component parts)
4. Single versus Multivalued Attributes
 A single-valued attribute is an attribute that can have
only a single value.
 For example, a person can have only one Social Security
number, and a manufactured part can have only one serial
number.
 Keep in mind that a single-valued attribute is not
necessarily a simple attribute.
 For instance, a part’s serial number, such as SE-08-02-189935,
is single-valued, but it is a composite attribute
because it can be subdivided into the region in which the
part was produced (SE), the plant within that region (08),
the shift within the plant (02), and the part number
(189935).
 Multivalued attributes are attributes that can have many
values.
 For instance, a person may have several college
degrees, and a household may have several different
phones, each with its own number.
 Similarly, a car’s color may be subdivided into many
colors (that is, colors for the roof, body, and trim).
 In the Chen ERM, the multivalued attributes are shown
by a double line connecting the attribute to the entity.
 In Figure 3.3, note that CAR_VIN is the primary key,
and CAR_COLOR is a multivalued attribute of the CAR
entity.
 Figure 3.4 shows Entity with a multivalued attribute
(Skill) and derived attribute (Years_Employed)
Figure 3.3 A multivalued attribute in an entity

Figure 3.4 Entity with a multivalued attribute (Skill)
and derived attribute (Years_Employed)
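One common way to carry a multivalued attribute such as CAR_COLOR into relational tables is to move it into its own table keyed by the owning entity. The slides do not prescribe this mapping; the sketch below, using Python's sqlite3 module, assumes it, and the section column and sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car (car_vin TEXT PRIMARY KEY);

    -- One row per value of the multivalued attribute CAR_COLOR.
    CREATE TABLE car_color (
        car_vin TEXT NOT NULL REFERENCES car(car_vin),
        section TEXT NOT NULL,   -- 'roof', 'body' or 'trim' (assumed labels)
        color   TEXT NOT NULL,
        PRIMARY KEY (car_vin, section)
    );

    INSERT INTO car VALUES ('VIN001');
    INSERT INTO car_color VALUES ('VIN001', 'roof', 'black');
    INSERT INTO car_color VALUES ('VIN001', 'body', 'white');
    INSERT INTO car_color VALUES ('VIN001', 'trim', 'grey');
""")

# All colors of one car, ordered by section name (body, roof, trim).
colors = [row[0] for row in conn.execute(
    "SELECT color FROM car_color WHERE car_vin = 'VIN001' ORDER BY section")]
```

The CAR row stays single-valued, while the number of colors per car is unrestricted.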
5. Derived versus Stored Attributes
 A derived attribute is an attribute whose value is
calculated (derived) from other attributes.
 Derived attributes are sometimes referred to as computed
attributes.
 The derived attribute need not be physically stored within
the database; instead, it can be derived by using an
algorithm.
 Not storing a derived attribute saves storage space, and
computing it on demand always yields a current value.
 However, derived attributes have their own disadvantages:
they use CPU processing cycles, increase data access time,
and add coding complexity to queries.
 For example, the Age attribute of an employee is a
derived attribute and is said to be derivable from the
BirthDate attribute, which is a stored attribute.
 A stored attribute is an attribute whose value is stored in
the database. For example, BirthDate is a stored attribute.
 Storing a derived value, on the other hand, saves CPU processing
cycles and data access time, makes the data value readily available,
and can be used to keep track of historical data.
 However, storing a derived value has a disadvantage: it requires
constant maintenance to ensure that the value is current,
especially if any values used in the calculation change.
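The Age example can be sketched as a small function that derives the value from the stored BirthDate each time it is needed. The function name age_on and the explicit today parameter are assumptions made so the sketch is deterministic.

```python
from datetime import date


def age_on(birth_date: date, today: date) -> int:
    """Derive Age (in whole years) from the stored BirthDate."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


day_before_birthday = age_on(date(1990, 6, 15), date(2024, 6, 14))
on_birthday = age_on(date(1990, 6, 15), date(2024, 6, 15))
```

Because nothing is stored, the result can never go stale, at the cost of recomputing it on every access.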

Fig 3.5 Depiction of a derived attribute
Domains
 Attributes have a domain. A domain is the set of possible values
for a given attribute.
 For example, the domain for the grade point average (GPA)
attribute is written (0,4) because the lowest possible GPA value
is 0 and the highest possible value is 4.
 The domain for the gender attribute consists of only two
possibilities: M or F (or some other equivalent code).
 Attributes may share a domain. For instance, a student address
and a professor address share the same domain of all possible
addresses.
 In fact, the data dictionary may let a newly declared attribute
inherit the characteristics of an existing attribute if the same
attribute name is used.
 For example, the PROFESSOR and STUDENT entities may
each have an attribute named ADDRESS and could therefore
share a domain.
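The GPA and gender domains described above can be approximated with CHECK constraints. This is a minimal sketch using Python's sqlite3 module; the table layout and the helper try_insert are assumptions for illustration, and the GPA domain is treated as the inclusive range 0 to 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        stu_id INTEGER PRIMARY KEY,
        gpa    REAL CHECK (gpa BETWEEN 0 AND 4),   -- domain of GPA
        gender TEXT CHECK (gender IN ('M', 'F'))   -- domain of gender
    )
""")


def try_insert(gpa, gender):
    """Return True if the values fall inside both domains."""
    try:
        conn.execute("INSERT INTO student (gpa, gender) VALUES (?, ?)", (gpa, gender))
        return True
    except sqlite3.IntegrityError:
        return False


inside_domain = try_insert(3.5, "F")
gpa_too_high = try_insert(4.5, "M")
unknown_gender = try_insert(2.0, "X")
```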
Example 1: Build an ER Diagram for the
following information:
• A student record management system will have
the following two basic data object categories with
their own features or properties: Students will
have an Id, Name, Dept, Age, GPA and Course
will have an Id, Name, Credit Hours. Whenever a
student enrolls in a course in a specific Academic
Year and Semester, the Student will have a grade
for the course.
Example 2: Build an ER Diagram for the following
information:
• A Personnel record management system will have
the following two basic data object categories
with their own features or properties: Employee
will have an Id, Name, DoB, Tel.
• Department will have an Id, Name, Location
• Whenever an Employee is assigned in one
Department, the duration of his stay in the
respective department should be registered.

Example 3

• A company database needs to store information


about employees (identified by ssn, with salary
and phone as attributes); departments (identified by
dno, with dname and budget as attributes); and
children of employees (with name and age as
attributes). Employees work in departments; each
department is managed by an employee; a child
must be identified uniquely by name when the
parent (who is an employee; assume that only one
parent works for the company) is known. We are
not interested in information about a child once the
parent leaves the company.
C) RELATIONSHIPS
 Relationship (relationship type) is a meaningful association
among entity types.
 Generally, a relationship is represented as a connection between
(or among) entities.
 In the standard ER model, a diamond shape connects the
participating entities.
 There are several types of relationships, based on degree,
cardinality, and participation.
 The entities that participate in a relationship are also known as
participants, and each relationship is identified by a name that
describes the relationship.
 The relationship name is an active or passive verb; for example,
a STUDENT takes a CLASS, a PROFESSOR teaches a
CLASS, a DEPARTMENT employs a PROFESSOR, a
DIVISION is managed by an EMPLOYEE.
 The relationship classification is difficult to establish if
you know only one side of the relationship.
 For example, if you specify that:
 A DIVISION is managed by one EMPLOYEE.
 You don’t know if the relationship is 1:1 or 1:N.
 Therefore, you should ask the question “Can an employee
manage more than one division?”.
 If the answer is yes, the relationship is 1:N, and the second
part of the relationship is then written as:
 An EMPLOYEE may manage many DIVISIONs.
 If an employee cannot manage more than one division, the
relationship is 1:1, and the second part of the relationship
is then written as:
 An EMPLOYEE may manage only one DIVISION.
1. Connectivity and Cardinality
 The connectivity of a relationship describes the
mapping of associated entity instances in the
relationship.
 The values of connectivity are "one" or "many".
 The cardinality of a relationship is the actual
number of related occurrences for each of the two
entities.
 The basic types of connectivity for relations are:
one-to-one, one-to-many, and many-to-many.
A) A One-to-One (1:1) Relationship is when one instance of
entity A is associated with at most one instance of entity B,
and vice versa.
 A customer is associated with at most one loan via the
relationship borrower.
 A loan is associated with at most one customer via
borrower
B) A one-to-many (1:N) relationship is when one instance
of entity A is associated with zero, one, or many instances
of entity B, but one instance of entity B is associated
with only one instance of entity A.
 An example of a 1:N relationship is: a
department has many employees, and each employee
is assigned to one department.
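The department and employee example can be sketched with a foreign key on the "many" side. This is a minimal sketch using Python's sqlite3 module; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        -- Each employee is assigned to exactly one department.
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Sales'), (2, 'Research');
    INSERT INTO employee VALUES (10, 'Abel', 1), (11, 'Sara', 1);
""")

# A department may have zero, one, or many employees.
per_dept = dict(conn.execute("""
    SELECT d.name, COUNT(e.emp_id)
    FROM department d LEFT JOIN employee e ON e.dept_id = d.dept_id
    GROUP BY d.dept_id
""").fetchall())

# An employee cannot reference a department that does not exist.
try:
    conn.execute("INSERT INTO employee VALUES (12, 'Lily', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```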

 An entity on one side of the relationship can have many related
entities, but an entity on the other side will have a maximum of
one related entity.

 In the one-to-many relationship:
 A customer is associated with several (including 0) loans via
borrower.
C) A many-to-many (M:N) relationship, sometimes called
non-specific, is when for one instance of entity A, there are
zero, one, or many instances of entity B and for one instance of
entity B there are zero, one, or many instances of entity A.
 An example is: employees can be assigned to no more than
two projects at the same time; projects must have assigned at
least three employees.
 A single employee can be assigned to many projects;
conversely, a single project can have many employees
assigned to it.
 Here the cardinality for the relationship between employees
and projects is two and the cardinality between project and
employee is three.
 Many-to-many relationships cannot be directly translated to
relational tables but instead must be transformed into two or
more one-to-many relationships.
• A customer is associated with several (possibly 0)
loans via borrower.
• A loan is associated with several (possibly 0)
customers via borrower.
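The customer, loan and borrower example can be sketched as two one-to-many relationships that meet in a linking table. This is a minimal sketch using Python's sqlite3 module; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL NOT NULL);

    -- borrower resolves the M:N relationship into two 1:N relationships.
    CREATE TABLE borrower (
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        loan_number TEXT    NOT NULL REFERENCES loan(loan_number),
        PRIMARY KEY (customer_id, loan_number)
    );

    INSERT INTO customer VALUES (1, 'Hana'), (2, 'Omar');
    INSERT INTO loan VALUES ('L-1', 500), ('L-2', 900);
    INSERT INTO borrower VALUES (1, 'L-1'), (1, 'L-2'), (2, 'L-1');
""")

# One customer, several loans.
loans_of_customer_1 = [row[0] for row in conn.execute(
    "SELECT loan_number FROM borrower WHERE customer_id = 1 ORDER BY loan_number")]

# One loan, several customers.
customers_of_loan_1 = [row[0] for row in conn.execute(
    "SELECT customer_id FROM borrower WHERE loan_number = 'L-1' ORDER BY customer_id")]
```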
 Cardinality expresses the minimum and maximum number
of entity occurrences associated with one occurrence of the
related entity.
 In the ERD, cardinality is indicated by placing the
appropriate numbers beside the entities, using the format
(x,y).
 The first value (x), represents the minimum number of
associated entities, while the second value (y) represents the
maximum number of associated entities.
 Cardinality constraints express the number of entities to
which another entity can be associated via a relationship set.
They are mostly useful in describing binary relationship sets.
 We express cardinality constraints by drawing either a
directed line (→), signifying “one,” or an undirected line
(—), signifying “many,” between the relationship set and
the entity set.
• Keep in mind that the cardinalities represent the number
of occurrences in the related entity.
• For example, the cardinality (1,4) written next to the
CLASS entity in the “PROFESSOR teaches CLASS”
relationship indicates that each professor teaches up to four
classes, which means that the PROFESSOR table’s primary
key value occurs at least once and no more than four times
as foreign key values in the CLASS table.
• If the cardinality had been written as (1,N), there would be
no upper limit to the number of classes a professor might
teach.
• Similarly, the cardinality (1,1) written next to the
PROFESSOR entity indicates that each class is taught by
one and only one professor.
• That is, each CLASS entity occurrence is associated with
one and only one entity occurrence in PROFESSOR.
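The (x,y) notation can be checked mechanically by counting how often each primary key value occurs as a foreign key value. The following is a minimal sketch in plain Python; the function name cardinality_violations and the sample data are assumptions, and maximum=None stands for N (no upper limit).

```python
from collections import Counter


def cardinality_violations(parent_keys, foreign_keys, minimum, maximum=None):
    """Return {parent key: occurrence count} for parents outside (minimum, maximum)."""
    counts = Counter(foreign_keys)
    bad = {}
    for key in parent_keys:
        n = counts[key]  # 0 when the key never occurs as a foreign key
        if n < minimum or (maximum is not None and n > maximum):
            bad[key] = n
    return bad


professors = ["P1", "P2", "P3"]
# Foreign key values found in the CLASS table: P1 teaches 2 classes, P2 teaches 5.
class_professor = ["P1", "P1", "P2", "P2", "P2", "P2", "P2"]

one_to_four = cardinality_violations(professors, class_professor, 1, 4)
one_to_many = cardinality_violations(professors, class_professor, 1)
```

Under (1,4), P2 exceeds the maximum and P3 misses the minimum; under (1,N), only P3 is reported.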
Figure 3.10 Cardinality in ERD
2. Relationship Participation
• Participation in an entity relationship is either optional
(partial participation) or mandatory (total participation).
• Recall that relationships are bidirectional; that is, they
operate in both directions.
• If COURSE is related to CLASS, then by definition, CLASS
is related to COURSE.
• Because of the bidirectional nature of relationships, it is
necessary to determine the connectivity of the relationship
from COURSE to CLASS and the connectivity of the
relationship from CLASS to COURSE.
• Similarly, the specific maximum and minimum cardinalities
must be determined in each direction for the relationship.
• Once again, you must consider the bidirectional nature of the
relationship when determining participation.
A) Optional (Partial) participation means that one entity
occurrence does not require a corresponding entity occurrence
in a particular relationship.
• For example, in the “COURSE generates CLASS”
relationship, you noted that at least some courses do not
generate a class.
• In other words, an entity occurrence in the COURSE table
does not necessarily require the existence of a corresponding
entity occurrence in the CLASS table.
• (Remember that each entity is implemented as a table.)
Therefore, the CLASS entity is considered to be optional to
the COURSE entity.
• The existence of an optional entity indicates that the
minimum cardinality is 0 for the optional entity.
• (The term optionality is used to label any condition in which
one or more optional relationships exist.)
B) Mandatory participation means that one entity
occurrence requires a corresponding entity occurrence in
a particular relationship.
• The existence of a mandatory relationship indicates that the
minimum cardinality is at least 1 for the mandatory entity.
• Let’s examine a few more scenarios.
• Suppose that Tiny College employs some professors who
conduct research without teaching classes.
• If you examine the “PROFESSOR teaches CLASS”
relationship, it is quite possible for a PROFESSOR not to
teach a CLASS.
• Therefore, CLASS is optional to PROFESSOR.
• On the other hand, a CLASS must be taught by a
PROFESSOR.
• Note that the ERD in the figure below shows the
cardinality next to CLASS to be (0,3), thus indicating that a
professor may teach no classes at all or as many as three
classes.
• And each CLASS table row will reference one and only one
PROFESSOR row—assuming each class is taught by one
and only one professor—represented by the (1,1) cardinality
next to the PROFESSOR table.

3. Relationship Degree
• Degree of a relationship: refers to the number of entity
types that participate in the relationship type (unary,
binary, ternary).
A) Unary Relationships
 A unary relationship exists when an association is
maintained within a single entity.
B) Binary Relationships
• A binary relationship exists when two entities are
associated in a relationship. Binary relationships are most
common.
C) Ternary Relationships
• A ternary relationship implies an association among three
different entities.
Figure 3. Degree of Relationships

5. Recursive Relationships
 A recursive entity is one in which a relationship can
exist between occurrences of the same entity set.
 A recursive entity is found within a unary relationship.
• A relationship type with the same participating entity type
in distinct roles
• Example: the SUPERVISION relationship
• EMPLOYEE participates twice in two distinct roles:
– supervisor (or boss) role
– supervisee (or subordinate) role
• Each relationship instance relates two distinct
EMPLOYEE entities:
– One employee in supervisor role
– One employee in supervisee role
 As another example, a COURSE may be a prerequisite to
a COURSE.
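The SUPERVISION relationship can be sketched as a table that references itself, with a self-join separating the two roles. This is a minimal sketch using Python's sqlite3 module; the column names and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id        INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        -- Recursive (unary) relationship: points back to EMPLOYEE.
        supervisor_id INTEGER REFERENCES employee(emp_id)  -- NULL at the top
    );
    INSERT INTO employee VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Carol', 1);
""")

# Alias s plays the supervisor role, alias e the supervisee role.
pairs = conn.execute("""
    SELECT s.name, e.name
    FROM employee e JOIN employee s ON e.supervisor_id = s.emp_id
    ORDER BY e.emp_id
""").fetchall()
```

Each returned pair is one relationship instance linking two distinct EMPLOYEE occurrences.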
Weak Entity Sets
• An entity set that does not have a primary key is referred
to as a weak entity set.
• The existence of a weak entity set depends on the
existence of an identifying entity set.
 It must relate to the identifying entity set via a total, one-
to-many relationship set from the identifying to the weak
entity set
 Identifying relationship depicted using a double diamond
• The discriminator (or partial key) of a weak entity set is
the set of attributes that distinguishes among all the
entities of a weak entity set.
• The primary key of a weak entity set is formed by the
primary key of the strong entity set on which the weak
entity set is existence dependent, plus the weak entity set’s
discriminator.
• We depict a weak entity set by double rectangles.
• We underline the discriminator of a weak entity set with a
dashed line.
• payment-number – discriminator of the payment entity set
• Primary key for payment – (loan-number, payment-
number)
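When the loan and payment example is carried into relational tables, the key (loan-number, payment-number) can be sketched as a composite primary key whose first part is also a foreign key. This is a minimal sketch using Python's sqlite3 module; the amount columns, the sample rows, and the use of ON DELETE CASCADE to model existence dependence are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to take effect
conn.executescript("""
    CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL NOT NULL);

    CREATE TABLE payment (
        loan_number    TEXT    NOT NULL
                       REFERENCES loan(loan_number) ON DELETE CASCADE,
        payment_number INTEGER NOT NULL,   -- discriminator (partial key)
        payment_amount REAL    NOT NULL,
        PRIMARY KEY (loan_number, payment_number)
    );

    INSERT INTO loan VALUES ('L-17', 1000), ('L-23', 2000);
    -- payment_number 1 repeats across loans; only the pair must be unique.
    INSERT INTO payment VALUES ('L-17', 1, 100), ('L-17', 2, 100), ('L-23', 1, 250);
""")

payments_before = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]

# Removing the strong entity removes its dependent weak entities.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining = conn.execute(
    "SELECT loan_number, payment_number FROM payment").fetchall()
```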

• Note: the primary key of the strong entity set is not explicitly
stored with the weak entity set, since it is implicit in the
identifying relationship.
• If loan-number were explicitly stored, payment could be
made a strong entity, but then the relationship between
payment and loan would be duplicated by an implicit
relationship defined by the attribute loan-number common
to payment and loan
• In a university, a course is a strong entity and a course-
offering can be modeled as a weak entity
• The discriminator of course-offering would be semester
(including year) and section-number (if there is more than
one section)
• If we model course-offering as a strong entity we would
model course-number as an attribute.
• Then the relationship with course would be implicit in the
course-number attribute.
Associative Entities

• An associative entity is an entity type that associates the
instances of one or more entity types and contains
attributes that are peculiar to the relationship between those
entity instances.
 It’s an entity – it has attributes
 AND it’s a relationship – it links entities together
• When should a relationship with attributes be an
associative entity?
 All relationships for the associative entity should be many
 The associative entity could have meaning independent of
the other entities.
 The associative entity should have one or more
attributes other than the identifier.
 The associative entity may participate in relationships
other than those with the entities of the associated relationship.
 Ternary relationships should be converted to associative
entities.
• The following figure shows the relationship
‘Completes’ converted to an associative entity type
• A CERTIFICATE is awarded to each
EMPLOYEE who completes a COURSE, each
certificate has a Certificate_Number that serves as
the identifier

Figure 3-11b – An associative entity (CERTIFICATE)
Enhanced E-R (EER) Models

• Object-oriented extensions to E-R model


• EER is important when we have a relationship between two
entities and the participation is partial between entity
occurrences.
• In such cases EER is used to reduce the complexity in
participation and relationship complexity.
• ER diagrams consider entity types to be primitive objects
• EER diagrams allow refinements within the structures of
entity types.
• EER Concepts:
 Generalization
 Specialization
 Sub classes/Super classes
 Attribute Inheritance
 Aggregation
• Constraints on specialization and generalization
Specialization
• Top-down design process; we designate
subgroupings within an entity set that are
distinctive from other entities in the set.
• These subgroupings become lower-level entity sets
that have attributes or participate in relationships
that do not apply to the higher-level entity set.
• Depicted by a triangle component labeled ISA
(E.g. customer “is a” person).
• Attribute inheritance – a lower-level entity set
inherits all the attributes and relationship
participation of the higher-level entity set to which
it is linked.
Specialization Example
Generalization
• A bottom-up design process – combine a number of entity sets that
share the same features into a higher-level entity set.
• Specialization and generalization are simple inversions of each
other; they are represented in an E-R diagram in the same way.
• The terms specialization and generalization are used
interchangeably.
• Can have multiple specializations of an entity set based on
different features.
• E.g. permanent-employee vs. temporary-employee, in
addition to officer vs. secretary vs. teller
• Each particular employee would be
– a member of one of permanent-employee or temporary-employee,
– and also a member of one of officer, secretary, or teller
• The ISA relationship is also referred to as the superclass-
subclass relationship.
Design Constraints on a Specialization/Generalization
• Constraint on which entities can be members of a given
lower-level entity set.
– condition-defined
• E.g. all customers over 65 years are members of
senior-citizen entity set; senior-citizen ISA person.
– user-defined
• Constraint on whether or not entities may belong to more
than one lower-level entity set within a single
generalization.
– Disjoint
• an entity can belong to only one lower-level entity set
• Noted in E-R diagram by writing disjoint next to the
ISA triangle
– Overlapping
• an entity can belong to more than one lower-level
entity set
Design Constraints on a Specialization/Generalization
(Contd.)
• Completeness constraint--specifies whether or
not an entity in the higher-level entity set must
belong to at least one of the lower-level entity sets
within a generalization.
– total : an entity must belong to one of the lower-level
entity sets
– partial: an entity need not belong to one of the lower-
level entity sets
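The disjoint/overlapping and total/partial constraints can be checked with simple set operations. The following is a minimal sketch in plain Python; the function name specialization_violations and the sample entity identifiers are assumptions for illustration.

```python
def specialization_violations(higher, lower_sets, disjoint, total):
    """Return the entities that break the given specialization constraints."""
    violations = set()
    covered = set()
    for members in lower_sets.values():
        if disjoint:
            # Entity already appears in another lower-level entity set.
            violations |= covered & members
        covered |= members
    if total:
        # Entity belongs to no lower-level entity set at all.
        violations |= higher - covered
    return violations


employees = {"e1", "e2", "e3", "e4"}
roles = {"officer": {"e1"}, "secretary": {"e2"}, "teller": {"e2", "e3"}}

disjoint_total = specialization_violations(employees, roles, disjoint=True, total=True)
overlapping_partial = specialization_violations(employees, roles, disjoint=False, total=False)
```

Here e2 breaks the disjoint constraint (secretary and teller) and e4 breaks the total constraint, while the overlapping, partial case accepts the same data.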
Aggregation
 Consider the ternary relationship works-on, which we
saw earlier
 Suppose we want to record managers for tasks
performed by an employee at a branch
Aggregation (Cont.)
• Relationship sets works-on and manages represent
overlapping information
– Every manages relationship corresponds to a works-on
relationship
– However, some works-on relationships may not correspond to
any manages relationships
• So we can’t discard the works-on relationship
• Eliminate this redundancy via aggregation
 Treat relationship as an abstract entity
 Allows relationships between relationships
 Abstraction of relationship into new entity
• Without introducing redundancy, the following diagram
represents:
– An employee works on a particular job at a particular branch
– An employee, branch, job combination may have an associated
manager
E-R Diagram With Aggregation