Chapter 4 Data Modeling
A model is an abstraction that hides unneeded details. Data modeling is used to represent the
entities of interest and their relationships in a database. A data model is a collection of
concepts that describe the structure of a database and provide the necessary means to achieve
this abstraction.
Conceptual, logical and physical models are three different ways of modeling data in a domain.
While they all contain entities and relationships, they differ in the purposes they are created for
and the audiences they target. As a general rule, a business analyst uses the conceptual and
logical models to describe, from a business angle, the data that a system requires and produces,
while a database designer refines that early design into the physical model, which presents the
physical database structure ready for database construction.
The user-level data model is the high-level, or conceptual, model. It provides concepts that are
close to the way many users perceive data. The purpose of a conceptual model is simply to
establish the entities, their attributes and their ‘high-level’ relationships.
Figure 1 Example of Conceptual Data Model
Features of a conceptual data model include:
Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.
We can see that the complexity increases from conceptual to logical to physical. This is why we
always start with the conceptual data model (so we understand at a high level what the
different entities in our data are and how they relate to one another), then move on to the
logical data model (so we understand the details of our data without worrying about how they
will actually be implemented), and finally the physical data model (so we know exactly how to
implement our data model in the database of choice). In a data warehousing project, the
conceptual data model and the logical data model are sometimes considered a single
deliverable.
Data modeling using the Entity-Relationship Diagram (ERD)
A database schema in the ER model can be represented pictorially using an Entity-Relationship
diagram (ER diagram): a graph with nodes representing entity sets, attributes and relationship
sets.
Entity: a real-world object or thing with an independent existence that is distinguishable
from other objects. Examples are a person, car, customer, product, gene, book, etc.
Attributes: an entity is represented by a set of attributes (its descriptive properties), e.g.,
name, age, salary, price etc. Attribute values that describe each entity become a major part of
the data eventually stored in a database. With each attribute a domain is associated, i.e., a set
of permitted values for an attribute. Possible domains are integer, string, date, etc.
Entity Type: Collection of entities that all have the same attributes, e.g., persons, cars,
customers etc.
Entity Set: Collection of entities of a particular entity type at any point in time; entity set is
typically referred to using the same name as entity type.
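The distinction between an entity type, its attribute domains and an entity set can be sketched in code. The following is only an illustration under assumed names: the Person class, its three attributes and the sample values are hypothetical and not part of any particular database.

```python
from dataclasses import dataclass

# Entity type: the attribute structure shared by all entities of this kind
# (hypothetical "Person" example).
@dataclass(frozen=True)
class Person:
    name: str      # domain: string
    age: int       # domain: non-negative integer
    salary: float  # domain: real number

    def __post_init__(self):
        # Enforce each attribute's domain (its set of permitted values).
        if not isinstance(self.name, str):
            raise TypeError("name must be a string")
        if not isinstance(self.age, int) or self.age < 0:
            raise ValueError("age must be a non-negative integer")

# Entity set: the entities of that type that exist at one point in time.
persons = {
    Person("Smith", 42, 5200.0),
    Person("Miller", 35, 4800.0),
}
```

Adding or removing a Person changes the entity set, while the entity type (the class) stays the same.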
Entities of an entity type need to be distinguishable. A super key of an entity type is a set of
one or more attributes whose values uniquely determine each entity in an entity set. A
candidate key of an entity type is a minimal super key, i.e., a super key from which no attribute
can be removed without losing uniqueness. For an entity type, several candidate keys may
exist. During conceptual design, one of the candidate keys is selected to be the primary key of
the entity type.
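As a rough sketch of these definitions, the functions below test whether a set of attributes is a super key of a sample entity set and enumerate the minimal ones. The function names, attribute names and sample data are assumptions made for illustration. Note that a key is a design-time constraint on all possible entity sets; uniqueness in one sample can only refute a proposed key, not prove it.

```python
from itertools import combinations

def is_super_key(entities, attrs):
    """True if the values of attrs are unique across the given entities."""
    seen = set()
    for e in entities:
        key = tuple(e[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(entities, attributes):
    """Return all minimal super keys of the sample, smallest first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found key is not minimal: skip it.
            if any(set(k) <= set(attrs) for k in found):
                continue
            if is_super_key(entities, attrs):
                found.append(attrs)
    return found

# Hypothetical sample entity set.
employees = [
    {"emp_id": 1, "email": "smith@example.com", "name": "Smith", "dept": "Pharmacy"},
    {"emp_id": 2, "email": "miller@example.com", "name": "Miller", "dept": "Pharmacy"},
    {"emp_id": 3, "email": "smith2@example.com", "name": "Smith", "dept": "Pharmacy"},
]
keys = candidate_keys(employees, ["emp_id", "email", "name", "dept"])
```

In this sample, keys contains the two single-attribute keys ("emp_id",) and ("email",); the designer would pick one of them as the primary key.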
Relationship (instance): association among two or more entities.
E.g.
Customer 'Smith' orders product 'PC42'.
Miller works in the Pharmacy department.
Degree of a relationship: refers to the number of entity types that participate in the
relationship type (binary, ternary, ...).
Roles: The same entity type can participate more than once in a relationship type.
Role labels clarify semantics of a relationship, i.e., the way in which an entity participates in a
relationship.
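Relationship instances, degree and role labels can be pictured as named tuples. This is a minimal sketch: the Supplies and Supervises relationship types and their field names are hypothetical examples, and only the "orders" instance comes from the text above.

```python
from typing import NamedTuple

class Orders(NamedTuple):
    """Binary relationship type: a customer orders a product."""
    customer: str
    product: str

class Supplies(NamedTuple):
    """Ternary relationship type (hypothetical): supplier, part, project."""
    supplier: str
    part: str
    project: str

class Supervises(NamedTuple):
    """Employee participates twice; the role labels tell the two apart."""
    supervisor: str   # role label (hypothetical)
    subordinate: str  # role label (hypothetical)

def degree(relationship_type):
    # Degree = number of participants in each relationship instance.
    return len(relationship_type._fields)

# Relationship sets: collections of relationship instances.
orders = {Orders(customer="Smith", product="PC42")}
supervises = {Supervises(supervisor="Miller", subordinate="Smith")}
```

Without the role labels, the pair ("Miller", "Smith") would not say who supervises whom.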
Multivalued Attributes: An attribute that can hold multiple values is known as a multivalued
attribute. It is represented by a double ellipse in an E-R diagram. E.g. a person can have more
than one phone number, so the phone number attribute is multivalued.
Derived Attribute: A derived attribute is one whose value is dynamic and derived from another
attribute. It is represented by a dashed ellipse in an E-R diagram. E.g. a person's age is a
derived attribute, as it changes over time and can be derived from another attribute (date of
birth).
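The two attribute kinds can be illustrated with a small class. This is a sketch with assumed names (Person, phone_numbers, age) and made-up sample values: the multivalued attribute is held as a list, and the derived attribute is computed on demand rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Person:
    name: str
    date_of_birth: date
    # Multivalued attribute: one person may have several phone numbers.
    phone_numbers: List[str] = field(default_factory=list)

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth for a given day.
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

p = Person("Smith", date(1990, 6, 15), ["555-0100", "555-0199"])
```

Because age is recomputed from the date of birth each time, it can never drift out of date the way a stored value could.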
Many-To-Many
Meaning: An employee can work in many departments, and a department can have several
employees.
Many-To-One
Meaning: An employee can work in at most one department, and a department can have
several employees.
One-To-Many
Meaning: An employee can work in many departments, but a department can have at most one
employee.
One-To-One
Meaning: An employee can work in at most one department, and a department can have at
most one employee.
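The four cases above can be checked against a concrete relationship set. The following sketch assumes a "works in" relationship stored as (employee, department) pairs; the function name and sample data are hypothetical. It reports the most restrictive cardinality the sample is consistent with, whereas the actual cardinality is a constraint the designer chooses.

```python
from collections import defaultdict

def mapping_cardinality(works_in):
    """Classify a set of (employee, department) pairs."""
    depts_of = defaultdict(set)  # employee -> departments
    emps_of = defaultdict(set)   # department -> employees
    for emp, dept in works_in:
        depts_of[emp].add(dept)
        emps_of[dept].add(emp)

    # Does every employee work in at most one department?
    emp_single = all(len(d) <= 1 for d in depts_of.values())
    # Does every department have at most one employee?
    dept_single = all(len(e) <= 1 for e in emps_of.values())

    if emp_single and dept_single:
        return "one-to-one"
    if emp_single:
        return "many-to-one"
    if dept_single:
        return "one-to-many"
    return "many-to-many"

sample = {("Smith", "Pharmacy"), ("Miller", "Pharmacy")}
kind = mapping_cardinality(sample)  # each employee has a single department
```

Here two employees share one department and neither works in a second one, so the sample fits the many-to-one case.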