
Chapter-2 Modelling Data-Edid 451 Sem

Uploaded by

om55500r
Copyright
© © All Rights Reserved

CHAPTER-2

2.1 MODELING DATA
Objectives
• Define terms
• Understand importance of data modeling
• Write good names and definitions for entities, relationships, and attributes
• Distinguish unary, binary, and ternary relationships

E-R Model Constructs

• Entities:
  - Entity instance – person, place, object, event, concept (often corresponds to a row in a table)
  - Entity type – collection of entities (often corresponds to a table)
• Relationships:
  - Relationship instance – link between entities (corresponds to primary key-foreign key equivalencies in related tables)
  - Relationship type – category of relationship…link between entity types
• Attributes:
  - Properties or characteristics of an entity or relationship type (often corresponds to a field in a table)
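
The correspondences listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the CUSTOMER and ORDER entity types, their column names, and the sample data are assumptions made for the example and do not come from the chapter.

```python
import sqlite3

# Hypothetical entity types and attributes, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (                 -- entity type -> table
    customer_id   INTEGER PRIMARY KEY,  -- attribute -> column (field)
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    -- relationship type -> foreign key referencing the related table
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
""")

# Entity instances -> rows.
conn.execute("INSERT INTO customer VALUES (1, 'Ada Lopez')")
conn.execute("INSERT INTO customer_order VALUES (100, '2024-01-15', 1)")

# A relationship instance is the pairing of equal primary key and foreign key values.
rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customer c JOIN customer_order o ON o.customer_id = c.customer_id
""").fetchall()
print(rows)
```

The join condition is where the "primary key-foreign key equivalency" shows up: order 100 is linked to Ada Lopez because both rows carry the value 1 for customer_id.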

Sample E-R Diagram

Entity relationship (ER) diagrams include rectangles representing entities, and lines between the entities
representing relationships. Relationships have cardinalities, which can be one-to-one, one-to-many, or many-
to-many. In addition, on each side of the relationship you can specify that it is mandatory or optional.

Basic E-R notation

Entities can be strong, weak, or associative. This will be explained later.

In addition to cardinalities, relationships also have degrees. A unary degree represents a relationship between
entities of the same entity type. A binary degree represents a relationship between entities of two different
entity types. A ternary degree represents a relationship between entities of three different entity types. In
principle you can have relationships between any number of entity types, so the term for this degree is “n-ary”.

Business Rules
- Are statements that define or constrain some aspect of the business
- Are derived from policies, procedures, events, functions
- Assert business structure
- Control/influence business behavior
- Are expressed in terms familiar to end users
- Are automated through DBMS software
A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert
business structure or to control or influence the behavior of the business.

Not all business rules are implemented in a database, and it is the responsibility of the database analyst to
determine which business rules can be expressed through ER models and which cannot.

A Good Business Rule Is:

• Declarative – what, not how
• Precise – clear, agreed-upon meaning
• Atomic – one statement
• Consistent – internally and externally
• Expressible – structured, natural language
• Distinct – non-redundant
• Business-oriented – understood by business people

Business rules appear (possibly implicitly) in descriptions of business functions, events, policies, units,
stakeholders, and other objects. These descriptions can be found in interview notes from individual and group
information systems requirements collection sessions, organizational documents (e.g., personnel manuals,
policies, contracts, marketing brochures, and technical instructions), and other sources. Rules are identified by
asking questions about the who, what, when, where, why, and how of the organization.

Gathering business rules requires good interviewing and listening skills. As a database analyst, you will ask
questions about the who, what, when, where, why, and how of the organization. You’ll need to be persistent in
clarifying initial statements of rules because initial statements may be vague or imprecise, so this is an iterative
inquiry process.

A Good Data Name Is:

• Related to business, not technical, characteristics
• Meaningful and self-documenting
• Unique
• Readable
• Composed of words from an approved list
• Repeatable
• Written in standard syntax

Data objects must be named and defined before they can be used unambiguously in a model of organizational
data. Data names refer to the names of entities, their attributes, and their relationships, which are the data
objects. These names should be meaningful to the business interests and operations. In a sense, data names
should be “self-documenting”, which means they should “obviously” capture the essence of the data object.

Data Definitions:
• Explanation of a term or fact
  - Term – word or phrase with specific meaning
  - Fact – association between two or more terms
• Guidelines for good data definition
  - A concise description of essential data meaning
  - Gathered in conjunction with systems requirements
  - Accompanied by diagrams
  - Achieved by consensus, and iteratively refined

It is difficult to obtain a universally agreed-upon data definition. So, you may want to use multiple definitions to
cover the various situations, or alternatively use a very general definition that will cover most situations. With data
definitions, as with most organizational knowledge, the person who controls the meaning of data controls the data.
Thus, the definition of data is a source of organizational power.

Entities:
• Entity – a person, a place, an object, an event, or a concept in the user environment about which the organization wishes to maintain data
• Entity type – a collection of entities that share common properties or characteristics
• Entity instance – a single occurrence of an entity type
It is important to distinguish an entity instance from an entity type. For example, an entity may be John Doe, a
particular person. But the entity type is “Person” as a concept. When you develop ER diagrams, the boxes
represent entity types, not entity instances. Although we use the word “entity” when describing ER diagrams,
what we are really talking about is “entity types”.

Entity Type and Entity Instances:

Here we see the distinction between an entity type and an entity instance. The entity type is represented in the first
two columns of this figure. It includes the names of the various attributes (remember what we talked about
regarding data names), as well as the types of data. By contrast, the third and fourth columns represent two entity
instances. These would be actual records (or rows) in the final database table that implements this entity type.

An Entity…
• SHOULD BE:
  - An object that will have many instances in the database
  - An object that will be composed of multiple attributes
  - An object that we are trying to model
• SHOULD NOT BE:
  - A user of the database system
  - An output of the database system (e.g., a report)
A common mistake people make when they are learning to draw E-R diagrams, especially if they are already
familiar with data process modeling (such as data flow diagramming), is to confuse data entities with other
elements of an overall information systems model. A simple rule to avoid such confusion is that a true data
entity will have many possible instances, each with a distinguishing characteristic, as well as one or more other
descriptive pieces of data.

Figure 2-4 Example of inappropriate entities

This figure illustrates a mistake many novices will make. The treasurer is a user of the system, and the expense
report is an output of the system. Neither of these is an entity that should be represented in the database or the ER
model. The ER model should represent the objects that are of interest to the user and that will be displayed in the
system output.

Strong vs. Weak Entities, and Identifying Relationships:

• Strong entity
  - exists independently of other types of entities
  - has its own unique identifier
  - identifier underlined with single line
• Weak entity
  - dependent on a strong entity (identifying owner)…cannot exist on its own
  - does not have a unique identifier (only a partial identifier)
  - entity box and partial identifier have double lines
• Identifying relationship
  - links strong entities to weak entities
Most of the basic entity types to identify in an organization are classified as strong entity types. A strong entity
type is one that exists independently of other entity types, so sometimes these are called “independent” entity
types. In contrast, a weak entity type is an entity type whose existence depends on some other entity type, so
these are sometimes called “dependent” entity types.

Figure 2-5 Example of a weak entity and its identifying relationship:

This figure shows an ER diagram depicting an identifying relationship between an identifying owner (the employee)
and a weak entity (the employee’s dependent). Note that the dependent’s identifier is only a partial identifier. The
full identification requires the identifying owner’s identifier as well. Also note the double lines that distinguish
the weak entity and the identifying relationship.
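
One way to see what a partial identifier means in practice is to sketch how the employee and dependent entities might eventually be stored. The following is a minimal sqlite3 sketch under assumed table and column names (none of them are taken from the figure): the dependent's key combines the owner's identifier with the partial identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (                      -- strong entity (identifying owner)
    employee_id   INTEGER PRIMARY KEY,
    employee_name TEXT NOT NULL
);
CREATE TABLE dependent (                     -- weak entity
    employee_id    INTEGER NOT NULL
                   REFERENCES employee(employee_id) ON DELETE CASCADE,
    dependent_name TEXT NOT NULL,            -- partial identifier only
    date_of_birth  TEXT,
    PRIMARY KEY (employee_id, dependent_name)  -- owner's identifier + partial identifier
);
""")
conn.execute("INSERT INTO employee VALUES (1, 'Pat Ng'), (2, 'Lee Cho')")
# Two dependents named 'Sam' are fine because they belong to different owners.
conn.execute("INSERT INTO dependent VALUES (1, 'Sam', '2015-03-02'), (2, 'Sam', '2018-07-19')")

# A dependent cannot exist without its identifying owner.
try:
    conn.execute("INSERT INTO dependent VALUES (99, 'Kim', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM employee WHERE employee_id = 1")
remaining = conn.execute("SELECT employee_id, dependent_name FROM dependent").fetchall()
print(orphan_rejected, remaining)
```

The ON DELETE CASCADE clause is a design choice made for this sketch; it is one common way to express "cannot exist on its own", not the only one.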
Guidelines for Naming and Defining Entities:
• Names:
  - Singular noun
  - Specific to organization
  - Concise, or abbreviation
  - For event entities, the result not the process
  - Name consistent for all diagrams
• Definitions:
  - “An X is…”
  - Describe unique characteristics of each instance
  - Explicit about what is and is not the entity
  - When an instance is created or destroyed
  - Changes to other entity types
  - History that should be kept
In addition to general guidelines about naming and defining data objects, there are some specific guidelines for
naming entity types. These are listed here.

Attributes:
• Attribute – property or characteristic of an entity or relationship type
• Classifications of attributes:
  - Required versus Optional Attributes
  - Simple versus Composite Attributes
  - Single-Valued versus Multivalued Attributes
  - Stored versus Derived Attributes
  - Identifier Attributes
In naming attributes, we use an initial capital letter followed by lowercase letters. If an attribute name consists
of more than one word, we use a space between the words and we start each word with a capital letter, for example,
Employee Name or Student Home Address. In E-R diagrams, we represent an attribute by placing its name in the
entity it describes.

Required vs. Optional Attributes:

Required – must have a value for every entity (or relationship) instance with which it is associated
Optional – may not have a value for every entity (or relationship) instance with which it is associated
This figure illustrates the various properties of an entity’s attributes. Required attributes must have a value,
whereas optional attributes could be null. Note that the identifier is ALWAYS required.
In this case, the student’s major is optional because a student may not yet have declared a major. All the other
attributes, however, are required.
Simple vs. Composite Attributes:
• Composite attribute – an attribute that has meaningful component parts (attributes)

Sometimes many attributes are related to each other, such as the elements of an address. In this case they can be
grouped into a composite attribute. For simplicity, we can refer to the “employee address”, but if we want more
detail we can break it into street, city, state, and postal code. So, this way we have the option to describe the
attribute at a macro or at a micro level. Note the use of parentheses for encompassing the components of a
composite attribute.
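
The macro versus micro view of a composite attribute can be sketched with Python dataclasses. The class and field names below are assumptions for illustration, loosely following the employee address example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Composite attribute: a group of meaningful component attributes."""
    street: str
    city: str
    state: str
    postal_code: str


@dataclass
class Employee:
    employee_id: int
    employee_name: str
    employee_address: Address  # referred to as a whole at the macro level


emp = Employee(1, "Pat Ng", Address("12 Elm St", "Austin", "TX", "78701"))

whole = emp.employee_address        # macro level: the address as one attribute
part = emp.employee_address.city    # micro level: one component of the address
print(whole, part)
```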

Multi-valued and Derived Attributes:


Multivalued – may take on more than one value for a given entity (or relationship) instance
Derived – values can be calculated from related attribute values (not physically stored in the database)

A multivalued attribute is not the same as a composite attribute, although novices may confuse these terms. A
composite attribute is one that has many parts, such as an address composed of street, city, state, and zip. By
contrast, a multivalued attribute is one that can have many different values for the same instance, such as the
skills of an employee who has several skills.

Note that a derived attribute is not one that is physically stored in the database, but rather one that is calculated
based on the value of another. The length of time employed, or a person’s age, are classic examples, as they are
calculated based on a fixed starting point (date hired or birthdate).

An attribute can be both composite and multivalued, and even derived as well, so these are independent classifications.
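
As a rough sketch of the difference, the class below stores a multivalued attribute as a list and computes a derived attribute on demand instead of storing it. The attribute names and sample values are assumptions for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Employee:
    employee_id: int
    date_employed: date                               # stored attribute
    skills: List[str] = field(default_factory=list)   # multivalued attribute

    def years_employed(self, today: date) -> int:
        """Derived attribute: calculated from date_employed, never stored."""
        years = today.year - self.date_employed.year
        # Subtract one if this year's anniversary has not been reached yet.
        if (today.month, today.day) < (self.date_employed.month, self.date_employed.day):
            years -= 1
        return years


emp = Employee(1, date(2015, 6, 1), ["Welding", "CAD"])
print(emp.skills, emp.years_employed(date(2024, 5, 31)))
```

Passing the current date in as a parameter keeps the calculation repeatable; calling date.today() inside the method would work too, at the cost of results that change from day to day.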

Identifiers (Keys):
• Identifier (Key) – an attribute (or combination of attributes) that uniquely identifies individual instances of an entity type
• Simple versus Composite Identifier
• Candidate Identifier – an attribute that could be an identifier…satisfies the requirements for being an identifier
Every entity type should have an identifier attribute. No two instances of the entity type may have the same
value for the identifier attribute. For example, a person (employee, student, etc.) cannot rely on the first and
last name to be an identifier, because many people could have the same name. Rather, the identifier should be
something like an employee ID, a social security number, or some other absolutely unique value.

Criteria for Identifiers:

• Choose identifiers that
  - Will not change in value
  - Will not be null
• Avoid intelligent identifiers (e.g., containing locations or people that might change)
• Substitute new, simple keys for long, composite keys
An identifier in the ER model will eventually become a primary key in the resulting database table. We’ll see
this in a later chapter. Identifiers are required, so they cannot be null, and their values should not change.
Consider an employee ID or a social security number. These do not change. A person’s name or home address,
however, could change. Also, identifiers must be unique, whereas several people could have the same name.
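
The uniqueness and not-null criteria can be tried out directly once the identifier becomes a primary key. The following is a small sqlite3 sketch; the table, column names, and sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    employee_id   TEXT PRIMARY KEY NOT NULL,  -- identifier: unique and never null
    employee_name TEXT NOT NULL               -- names may repeat, so not an identifier
)""")


def try_insert(employee_id, employee_name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (employee_id, employee_name))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("E001", "John Doe"),   # accepted
    try_insert("E002", "John Doe"),   # accepted: same name, different identifier
    try_insert("E001", "Jane Roe"),   # rejected: duplicate identifier value
    try_insert(None, "No Id"),        # rejected: identifier may not be null
]
print(results)
```

NOT NULL is written out explicitly because SQLite, unlike most database systems, otherwise tolerates NULL in a non-integer primary key column.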

Figure 2-9 Simple and composite identifier attributes:

In the ER diagram, an identifier will be underlined. Note also that required attributes are typically boldfaced, so
all identifiers will be boldfaced as well. If an identifier is composite, then all its component parts are required.

Naming Attributes:
• Name should be a singular noun or noun phrase
• Name should be unique
• Name should follow a standard format
  - e.g. [Entity type name { [ Qualifier ] } ] Class
• Similar attributes of different entity types should use the same qualifiers and classes

As with all other data objects, there are guidelines for naming and defining attributes. These are listed in this slide
and the next.
A common naming format is [Entity type name { [ Qualifier ] } ] Class, where [ . . . ] is an optional clause, and { . . . }
indicates that the clause may repeat. Entity type name is the name of the entity with which the attribute is
associated. The entity type name may be used to make the attribute name explicit. It is almost always used for the
identifier attribute (e.g., Customer ID) of each entity type.
Class is a phrase from a list of phrases defined by the organization that are the permissible characteristics or
properties of entities (or abbreviations of these characteristics). For example, permissible values (and associated
approved abbreviations) for Class might be Name (Nm), Identifier (ID), Date (Dt), or Amount (Amt). Class is
required.
Qualifier (optional) is a phrase from a list of phrases defined by the organization that are used to place constraints
on classes.
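
The naming format can be read as a small assembly rule: an optional entity type name, zero or more qualifiers, then a required class. The helper below is a hypothetical sketch of that rule. The approved class list reuses the examples given above, while the qualifier "Birth" and the function itself are assumptions for illustration.

```python
# Approved classes taken from the examples in the text; a real organization
# would maintain its own list.
APPROVED_CLASSES = {"Name", "ID", "Date", "Amount"}


def attribute_name(attr_class, entity=None, qualifiers=()):
    """Build a name following [Entity type name { [ Qualifier ] } ] Class."""
    if attr_class not in APPROVED_CLASSES:
        raise ValueError(f"Class {attr_class!r} is not on the approved list")
    parts = []
    if entity:
        parts.append(entity)      # optional entity type name
    parts.extend(qualifiers)      # zero or more qualifiers
    parts.append(attr_class)      # class is required
    return " ".join(parts)


identifier_name = attribute_name("ID", entity="Customer")
qualified_name = attribute_name("Date", entity="Employee", qualifiers=["Birth"])
print(identifier_name, "|", qualified_name)
```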

Defining Attributes:
✓ State what the attribute is and possibly why it is important
✓ Make it clear what is and is not included in the attribute’s value
✓ Include aliases in documentation
✓ State source of values
✓ State whether attribute value can change once set
✓ Specify required vs. optional
✓ State min and max number of occurrences allowed
✓ Indicate relationships with other attributes

2.2 Modeling Data in the Organization
Database Development Process
Objectives
• Model different types of attributes, entities, relationships, and cardinalities
• Draw E-R diagrams for common business situations
• Convert many-to-many relationships to associative entities
• Model time-dependent data using time stamps

Modeling Relationships
Relationship Types vs. Relationship Instances
• The relationship type is modeled as lines between entity types…the instance is between specific entity instances
Relationships can have attributes
• These describe features pertaining to the association between the entities in the relationship
Two entities can have more than one type of relationship between them (multiple relationships)
Associative Entity – combination of relationship and entity

Figure 2-10 Relationship types and instances

This figure illustrates the difference between relationship types and relationship instances. The ER diagram
depicts types. It depicts both entity types and relationship types. The actual data that would be in the database
constitutes instances, both relationship and entity instances.

Degree of Relationships
Degree of a relationship is the number of entity types that participate in it
• Unary Relationship
• Binary Relationship
• Ternary Relationship
Most relationships are of binary degree, but it is possible to have any number of entity types involved in a
relationship. “Ternary” refers to three. If you have more than that, it is sometimes referred to generically as an
“n-ary” relationship.

Degree of relationships – from Figure 2-2

One example of a unary relationship would be the supervisor-subordinate relationship, which exists between
employees.

Cardinality of Relationships
One-to-One
• Each entity in the relationship will have exactly one related entity
One-to-Many
• An entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity
Many-to-Many
• Entities on both sides of the relationship can have many related entities on the other side

Figure 2-12 Examples of relationships of different degrees


a) Unary relationships
Although this figure of unary relationships shows only one-to-one and one-to-many cardinalities, it is also possible
to have many-to-many unary relationships. For example, consider a Person entity with a “friend” relationship. A
particular person can have many friends, and each friend could in turn have other friends. This is different from
the one-to-many relationship of employees. Although a supervisor could manage many subordinates, typically a
subordinate only reports to one supervisor.
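
The two unary cases described here tend to be stored differently. As a sketch under assumed table names, a one-to-many unary relationship fits in a self-referencing column, while a many-to-many unary relationship needs its own table pairing two instances of the same entity type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One-to-many unary: each employee reports to at most one supervisor.
CREATE TABLE employee (
    employee_id   INTEGER PRIMARY KEY,
    supervisor_id INTEGER REFERENCES employee(employee_id)
);
-- Many-to-many unary: a person can have many friends, and vice versa.
CREATE TABLE person (person_id INTEGER PRIMARY KEY);
CREATE TABLE friendship (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    friend_id INTEGER NOT NULL REFERENCES person(person_id),
    PRIMARY KEY (person_id, friend_id)
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, None), (2, 1), (3, 1)])
conn.executemany("INSERT INTO person VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO friendship VALUES (?, ?)", [(1, 2), (1, 3), (2, 3)])

subordinates = [row[0] for row in conn.execute(
    "SELECT employee_id FROM employee WHERE supervisor_id = 1 ORDER BY employee_id")]
listed_as_friend_by = [row[0] for row in conn.execute(
    "SELECT person_id FROM friendship WHERE friend_id = 3 ORDER BY person_id")]
print(subordinates, listed_as_friend_by)
```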
b) Binary relationships

Here are binary degree relationships with all the different possible cardinalities.

c) Ternary relationships

The cardinality of this ternary relationship is many-to-many-to-many. In other words, each vendor could supply
many parts to many warehouses. Each part could come from many vendors and be housed in many warehouses. Each
warehouse could have many parts from many vendors.
The dashed line is a way of representing the attributes of the relationship. For a given vendor supplying a given
part to a given warehouse, there is a shipping mode and a unit cost. Each of these ternary relationship instances
could have its own shipping mode and unit cost.
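
To make "each relationship instance could have its own shipping mode and unit cost" concrete, here is a sqlite3 sketch in which one row stands for one vendor-part-warehouse combination. Table names, column names, and values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE vendor    (vendor_id    INTEGER PRIMARY KEY);
CREATE TABLE part      (part_id      INTEGER PRIMARY KEY);
CREATE TABLE warehouse (warehouse_id INTEGER PRIMARY KEY);
-- One row per ternary relationship instance, carrying the relationship's attributes.
CREATE TABLE supplies (
    vendor_id     INTEGER NOT NULL REFERENCES vendor(vendor_id),
    part_id       INTEGER NOT NULL REFERENCES part(part_id),
    warehouse_id  INTEGER NOT NULL REFERENCES warehouse(warehouse_id),
    shipping_mode TEXT NOT NULL,
    unit_cost     REAL NOT NULL,
    PRIMARY KEY (vendor_id, part_id, warehouse_id)
);
""")
conn.execute("INSERT INTO vendor VALUES (1)")
conn.execute("INSERT INTO part VALUES (10)")
conn.executemany("INSERT INTO warehouse VALUES (?)", [(100,), (200,)])
# The same vendor and part, shipped to two warehouses on different terms.
conn.executemany(
    "INSERT INTO supplies VALUES (?, ?, ?, ?, ?)",
    [(1, 10, 100, "truck", 4.50), (1, 10, 200, "rail", 3.75)],
)

terms = conn.execute("""
    SELECT warehouse_id, shipping_mode, unit_cost
    FROM supplies WHERE vendor_id = 1 AND part_id = 10
    ORDER BY warehouse_id
""").fetchall()
print(terms)
```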

Cardinality Constraints
Cardinality Constraints – the number of instances of one entity that can or must be associated with each
instance of another entity
Minimum Cardinality
• If zero, then optional
• If one or more, then mandatory
Maximum Cardinality
• The maximum number of related instances
A relationship may require an entity to have at least one related entity (mandatory), or it may not (optional).
There may also be an upper limit on how many entities a given entity can be related to.

Figure 2-17 Examples of cardinality constraints


a) Mandatory cardinalities

Note the hatch mark vs. the circle. The hatch mark indicates mandatory cardinalities, whereas the circle
represents optional cardinalities. This figure indicates that each patient must have had at least one visit
(mandatory), and could have many more (many). By contrast, each patient history (visit) record must be
associated with exactly one patient.
Note that in all these ER diagrams cardinality is represented using something called “crow’s-foot” notation. The
three prongs on the many side of the relationship are called a “crow’s foot”. There are other possible notations for
ER diagrams. For example, Microsoft Visio by default shows an arrow from the many side to the one side, although
you can change it to crow’s-foot notation.

b) One optional, one mandatory

This figure shows a binary many-to-many relationship. In this case one side is mandatory and the other is optional.
Here every project must have at least one employee assigned to it, but an employee could possibly not be assigned
to any projects.

c) Optional cardinalities

This is a unary one-to-one relationship. According to this, a person could be married to one or no other person.
This figure rules out polygamy. Can you see why? How would we be able to allow polygamy in this ER diagram?
(Answer: make it many-to-many).
Dawn and Fred are single. Shirley is married to Ellis and Mack is married to Kathy.

Figure 2-21 Examples of multiple relationships

a) Employees and departments

Here, we see a one-to-many unary relationship between employees. It shows that a given employee MUST have
exactly one supervisor and could supervise any number of other employees (or none at all).
We also see two binary relationships between employees and departments. First, each department must have at
least one, and possibly many, employees. Each employee must work in exactly one department. Also, each
department has exactly one employee as a manager, and each employee can manage at most one department, or
possibly none at all.
The figure illustrates that there could be multiple types of relationships between entities.

b) Professors and courses (fixed lower limit constraint)


Here, min cardinality constraint is 2. At least two professors must be qualified to teach each course. Each
professor must be qualified to teach at least one course.

Again, we see multiple relationships between two entities, this time between professors and courses. The “Is
Qualified” relationship is of binary degree and mandatory many-to-many cardinality. A professor must be qualified
to teach at least one course. And a course must have at least two qualified professors.
The other relationship is actually implemented via what is called an “associative entity” called Schedule, which
has an identifier attribute called Semester. We will shortly talk about associative entities in more detail. This
associative entity implements a many-to-many relationship between professors and courses, indicating that a
particular professor may be scheduled during a particular semester to teach many courses, and vice versa.
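
The minimum cardinalities on the “Is Qualified” relationship (at least one course per professor, at least two professors per course) can be checked mechanically against sample data. The function and the data below are a hypothetical sketch, not something taken from the figure.

```python
from collections import Counter


def cardinality_violations(pairs, instances, minimum, maximum=None):
    """Return the instances whose number of related partners is out of bounds.

    pairs     -- (instance, partner) tuples, one per relationship instance
    instances -- every instance on the side being checked
    """
    counts = Counter(left for left, _right in pairs)
    return sorted(
        inst for inst in instances
        if counts[inst] < minimum or (maximum is not None and counts[inst] > maximum)
    )


# Hypothetical relationship instances: (professor, course).
is_qualified = [("Prof A", "Intro DB"), ("Prof B", "Intro DB"), ("Prof A", "Networks")]
professors = ["Prof A", "Prof B", "Prof C"]
courses = ["Intro DB", "Networks"]

# Each professor must be qualified for at least one course.
unqualified_professors = cardinality_violations(is_qualified, professors, minimum=1)

# Each course needs at least two qualified professors (the fixed lower limit).
course_side = [(course, prof) for prof, course in is_qualified]
understaffed_courses = cardinality_violations(course_side, courses, minimum=2)
print(unqualified_professors, understaffed_courses)
```

The optional maximum parameter covers upper limits in the same way, for example maximum=1 for the “one” side of a one-to-many relationship.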

Figure 2-15a and 2-15b Multivalued attributes can be represented as relationships

In this figure we see two examples of multivalued attributes on the left. On the right we see instead separate
entities with relationships.
The top figure shows a simple multivalued attribute, whereas the bottom figure shows a composite multivalued
attribute. Note that on the right, it is explicit that there are many-to-many relationships. For example, although the
left side shows that a course can have many prerequisites, there is nothing explicit showing that a course could
itself be a prerequisite for multiple other courses. Similarly, on the left it is explicitly shown that an employee can
have many skills, but it is not explicitly shown that many employees can share the same skill. The figures on the
right, however, do make these facts explicit.
The right side of each figure is in Microsoft Visio notation.

Associative Entities
An entity–has attributes
A relationship–links entities together
When should a relationship with attributes instead be an associative entity?
✓ All relationships for the associative entity should be many
✓ The associative entity could have meaning independent of the other entities
✓ The associative entity preferably has a unique identifier, and should also have other attributes
✓ The associative entity may participate in relationships with entities other than those of the
associated relationship
✓ Ternary relationships should be converted to associative entities

Figure 2-11a A binary relationship with an attribute

Here, the date completed attribute pertains specifically to the employee’s completion of a course…it is an attribute
of the relationship.

Here, the relationship simply states that an employee completed a course on a particular date. The completion is
represented as a relationship, and is not an entity unto itself.

Figure 2-11b An associative entity (CERTIFICATE)

An associative entity is like a relationship with an attribute, but it is also considered to be an entity in its
own right.
Note that the many-to-many cardinality between entities in Figure 2-11a has been replaced by two one-
to-many relationships with the associative entity.

Here, the simple relationship has been replaced with an associative entity. A certificate is considered to be an
entity unto itself, and in fact even has a unique identifier attribute.
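
The replacement of one many-to-many relationship by two one-to-many relationships can be sketched as three tables. The identifier name certificate_number and the other column names are assumptions for illustration; the figure itself may label them differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, employee_name TEXT NOT NULL);
CREATE TABLE course   (course_id   INTEGER PRIMARY KEY, course_title  TEXT NOT NULL);
-- Associative entity: its own identifier plus one foreign key per one-to-many side.
CREATE TABLE certificate (
    certificate_number INTEGER PRIMARY KEY,
    employee_id        INTEGER NOT NULL REFERENCES employee(employee_id),
    course_id          INTEGER NOT NULL REFERENCES course(course_id),
    date_completed     TEXT NOT NULL
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Pat Ng"), (2, "Lee Cho")])
conn.executemany("INSERT INTO course VALUES (?, ?)", [(10, "Safety"), (20, "Welding")])
conn.executemany(
    "INSERT INTO certificate VALUES (?, ?, ?, ?)",
    [(500, 1, 10, "2024-02-01"), (501, 1, 20, "2024-03-15"), (502, 2, 10, "2024-02-01")],
)

# One employee has many certificates, and one course has many certificates.
courses_for_pat = [row[0] for row in conn.execute("""
    SELECT c.course_title FROM certificate ct
    JOIN course c ON c.course_id = ct.course_id
    WHERE ct.employee_id = 1 ORDER BY c.course_title
""")]
holders_of_safety = [row[0] for row in conn.execute("""
    SELECT e.employee_name FROM certificate ct
    JOIN employee e ON e.employee_id = ct.employee_id
    WHERE ct.course_id = 10 ORDER BY e.employee_name
""")]
print(courses_for_pat, holders_of_safety)
```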

Figure 2-13c An associative entity – bill of materials structure

This could just be a relationship with attributes…it’s a judgment call.

Here we see another example of an associative entity representing a bill of materials structure. If not for an
associative entity, the BOM structure would be represented as a unary many-to-many relationship between items.

Figure 2-18 Cardinality constraints in a ternary relationship

Here is another example of an associative entity, this time as the center of a ternary relationship.

Figure 2-19 Simple example of time-stamping

The Price History attribute is both multivalued and composite.

Time stamp – a time value that is associated with a data value, often indicating when some event occurred that
affected the data value
Time stamps are useful for keeping historical data. In this case, we see a way of keeping track of price
changes over time for products.
Can you think of how the price history could be represented as a separate entity instead of a multivalued
attribute? What would be the cardinality of the relationship between product and price history? (Answer:
one-to-many).
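
A minimal sketch of that separate-entity version follows, assuming hypothetical table and column names. Each price row carries an effective date as its time stamp, so the price in force on any given day can be looked up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_name TEXT NOT NULL);
-- One product has many price history rows (one-to-many).
CREATE TABLE price_history (
    product_id     INTEGER NOT NULL REFERENCES product(product_id),
    effective_date TEXT NOT NULL,   -- time stamp in ISO format, so text order is date order
    price          REAL NOT NULL,
    PRIMARY KEY (product_id, effective_date)
);
""")
conn.execute("INSERT INTO product VALUES (1, 'Oak Desk')")
conn.executemany(
    "INSERT INTO price_history VALUES (?, ?, ?)",
    [(1, "2024-01-01", 100.0), (1, "2024-03-01", 120.0)],
)


def price_on(product_id, day):
    """Return the price in effect on the given ISO date, or None if none applies."""
    row = conn.execute(
        """SELECT price FROM price_history
           WHERE product_id = ? AND effective_date <= ?
           ORDER BY effective_date DESC LIMIT 1""",
        (product_id, day),
    ).fetchone()
    return row[0] if row else None


print(price_on(1, "2024-02-15"), price_on(1, "2024-03-01"), price_on(1, "2023-12-31"))
```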

Figure 2-20c E-R diagram with associative entity for product assignment to product line over time

Modeling time-dependent data has become more important due to regulations such as HIPAA and Sarbanes-Oxley.

The Assignment associative entity shows the date range of a product’s assignment to a particular product line.

Figure 2-22 Data model for Pine Valley Furniture Company in Microsoft Visio notation

Different modeling software tools may have different notation for the same constructs.

As you can see, data models can be quite comprehensive, including many different entities and relationships.
