Integrity Rules and Constraints

ADRIENNE WATT & NELSON ENG

Constraints are a very important feature in a relational model. In fact, the relational
model supports the well-defined theory of constraints on attributes or tables.
Constraints are useful because they allow a designer to specify the semantics of
data in the database. Constraints are the rules that force DBMSs to check that data
satisfies the semantics.

Domain Integrity
A domain restricts the values that an attribute in a relation may take and is itself a
constraint of the relational model. However, some real-world semantics for data cannot
be specified with domain constraints alone. We need more specific ways to state
which data values are or are not allowed and which format is suitable for an
attribute. For example, the Employee ID (EID) must be unique, or the employee
Birthdate must be in the range [Jan 1, 1950, Jan 1, 2000]. Such information is provided
in logical statements called integrity constraints.
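
As a minimal sketch, the two rules above could be declared as follows. It assumes SQLite driven from Python’s standard sqlite3 module, chosen only so that the example runs without a database server; it is not the SQL Server or MS Access setup used elsewhere in this chapter. The column layout and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee (
        EID       INTEGER NOT NULL UNIQUE,
        Birthdate TEXT NOT NULL
            CHECK (Birthdate BETWEEN '1950-01-01' AND '2000-01-01')
    )
""")

def try_insert(eid, birthdate):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO Employee VALUES (?, ?)", (eid, birthdate))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(1, "1975-06-30")       # satisfies both rules
duplicate_eid = try_insert(1, "1980-01-01")  # EID 1 is already taken
out_of_range = try_insert(2, "2005-03-15")   # outside the allowed range
```

Dates are stored as ISO 8601 text here, so the range check is a plain string comparison; a DBMS with a native date type would compare dates directly.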

There are several kinds of integrity constraints, described below.

Entity integrity
To ensure entity integrity, it is required that every table have a primary key. Neither
the PK nor any part of it can contain null values. This is because null values for the
primary key mean we cannot identify some rows. For example, in the
EMPLOYEE table, Phone cannot be a primary key since some people may not
have a telephone.
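
The entity integrity rule can be sketched as below, again assuming SQLite through Python’s sqlite3 module. The EMPLOYEE columns other than Phone and the helper rejected are hypothetical. EID serves as the primary key, while Phone stays an ordinary nullable column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only rejects NULL in a non-integer primary key when NOT NULL is
# spelled out, so it is stated explicitly here.
conn.execute("""
    CREATE TABLE Employee (
        EID   TEXT NOT NULL PRIMARY KEY,
        Name  TEXT NOT NULL,
        Phone TEXT              -- optional: some people have no telephone
    )
""")

def rejected(sql, params):
    """Return True if the statement violates an integrity constraint."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

insert = "INSERT INTO Employee (EID, Name, Phone) VALUES (?, ?, ?)"
first_row_rejected = rejected(insert, ("E1", "Ann", None))       # NULL phone is fine
null_pk_rejected = rejected(insert, (None, "Bob", "555-0100"))   # NULL key
duplicate_pk_rejected = rejected(insert, ("E1", "Cy", "555-0101"))
```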

Referential integrity
Referential integrity requires that a foreign key must have a matching primary key
or it must be null. This constraint is specified between two tables (parent and
child); it maintains the correspondence between rows in these tables. It means the
reference from a row in one table to another table must be valid.

Examples of referential integrity constraints in the Customer/Order database of the
Company:

• Customer(CustID, CustName)
• Order(OrderID, CustID, OrderDate)

To ensure that there are no orphan records, we need to enforce referential
integrity. An orphan record is one whose foreign key (FK) value is not found in the
corresponding entity, that is, the entity where the PK is located. Recall that a typical
join is between a PK and an FK.
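
One way to look for orphan records in tables where referential integrity was never enforced is an outer join from the FK side to the PK side. The sketch below assumes SQLite through Python’s sqlite3 module; the sample rows are invented for illustration, and the child table is named Orders because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # FK checks are deliberately left off here
conn.executescript("""
    CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, CustName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustID INTEGER, OrderDate TEXT);
    INSERT INTO Customer VALUES (1, 'Acme'), (2, 'Globex');
    -- Order 101 points at a customer that does not exist (an orphan);
    -- order 102 has a NULL FK, which is not an orphan.
    INSERT INTO Orders VALUES (100, 1, '2024-01-15'),
                              (101, 7, '2024-01-16'),
                              (102, NULL, '2024-01-17');
""")

# Keep every order, attach its customer if one exists, then keep only the
# orders that have an FK value but no matching PK row.
orphans = conn.execute("""
    SELECT o.OrderID
    FROM Orders AS o
    LEFT JOIN Customer AS c ON c.CustID = o.CustID
    WHERE o.CustID IS NOT NULL AND c.CustID IS NULL
    ORDER BY o.OrderID
""").fetchall()
```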

The referential integrity constraint states that the customer ID (CustID) in the Order
table must match a valid CustID in the Customer table. Most relational databases
have declarative referential integrity. In other words, when the tables are created
the referential integrity constraints are set up.

Here is another example from a Course/Class database:

• Course(CrsCode, DeptCode, Description)
• Class(CrsCode, Section, ClassTime)

The referential integrity constraint states that CrsCode in the Class table must
match a valid CrsCode in the Course table. In this situation, it is not enough that
CrsCode and Section together make up the PK of the Class table; we must also
enforce referential integrity.

When setting up referential integrity, it is important that the PK and FK have the
same data type and come from the same domain; otherwise the relational database
management system (RDBMS) will not allow the join. An RDBMS is a popular kind of
database system based on the relational model introduced by E. F. Codd of
IBM’s San Jose Research Laboratory. Relational database systems are generally
considered easier to use and understand than other database systems.

Referential integrity in Microsoft Access
In Microsoft (MS) Access, referential integrity is set up by joining the PK in the
Customer table to the CustID in the Order table. See Figure 9.1 for a view of how
this is done on the Edit Relationships screen in MS Access.

Figure 9.1. Referential access in MS Access, by A. Watt.

Referential integrity using Transact-SQL (MS SQL Server)
When using Transact-SQL, referential integrity is set when creating the Order
table with the FK. The table is named Orders in the code because ORDER is a
reserved word in SQL. Listed below are the statements showing the FK in the Orders
table referencing the PK in the Customer table.

-- Parent table: CustID is the primary key.
CREATE TABLE Customer
( CustID INTEGER PRIMARY KEY,
  CustName CHAR(35) );

-- Child table: CustID is a foreign key that references Customer.
CREATE TABLE Orders
( OrderID INTEGER PRIMARY KEY,
  CustID INTEGER REFERENCES Customer(CustID),
  OrderDate DATETIME );
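
To see such a declaration being enforced, here is a small runnable sketch. It assumes SQLite through Python’s sqlite3 module rather than SQL Server, so OrderDate is stored as text and FK checking has to be switched on explicitly. The sample data and the helper insert_order are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE Customer (
        CustID   INTEGER PRIMARY KEY,
        CustName CHAR(35)
    );
    CREATE TABLE Orders (
        OrderID   INTEGER PRIMARY KEY,
        CustID    INTEGER REFERENCES Customer(CustID),
        OrderDate TEXT
    );
    INSERT INTO Customer VALUES (1, 'Acme');
""")

def insert_order(order_id, cust_id):
    """Return True if the order row is accepted, False if it is rejected."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Orders VALUES (?, ?, '2024-01-15')",
                (order_id, cust_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

valid_parent = insert_order(100, 1)    # matches Customer 1
null_parent = insert_order(101, None)  # a NULL FK is allowed
orphan = insert_order(102, 99)         # there is no Customer 99
```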

Foreign key rules
Additional foreign key rules may be added when setting referential integrity, such
as what to do with the child rows (in the Orders table) when the parent record that
holds the PK (in the Customer table) is deleted or changed (updated). For example, the
Edit Relationships window in MS Access (see Figure 9.1) shows two additional
options for FK rules: Cascade Update and Cascade Delete. If these are not selected,
the system will prevent the deletion or update of PK values in the parent table
(Customer table) if a child record exists. A child record is any record whose FK
matches that PK value.

In some databases, an additional option called Set to Null exists when selecting the
Delete option. If this is chosen, the PK row is deleted, but the FK in the child
table is set to NULL. Though this leaves the child row without a parent, it is
acceptable because a null FK does not violate referential integrity.
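
The three delete behaviours described above (prevent, cascade, set to null) can be compared side by side. This is a sketch assuming SQLite through Python’s sqlite3 module, where the rules are written as ON DELETE clauses; the function delete_parent and the sample rows are hypothetical.

```python
import sqlite3

def delete_parent(rule):
    """Delete Customer 1 under the given ON DELETE rule and report the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, CustName TEXT);
        CREATE TABLE Orders (
            OrderID INTEGER PRIMARY KEY,
            CustID  INTEGER REFERENCES Customer(CustID) ON DELETE {rule}
        );
        INSERT INTO Customer VALUES (1, 'Acme');
        INSERT INTO Orders VALUES (100, 1);
    """)
    try:
        with conn:
            conn.execute("DELETE FROM Customer WHERE CustID = 1")
    except sqlite3.IntegrityError:
        return "delete blocked"
    # The delete went through: show what is left in the child table.
    return conn.execute("SELECT OrderID, CustID FROM Orders").fetchall()

no_action = delete_parent("NO ACTION")  # child exists, so the delete is refused
cascade = delete_parent("CASCADE")      # child row is deleted with the parent
set_null = delete_parent("SET NULL")    # child row stays, its FK becomes NULL
```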

Enterprise Constraints
Enterprise constraints – sometimes referred to as semantic constraints – are
additional rules specified by users or database administrators and can be based on
multiple tables.

Here are some examples.

• A class can have a maximum of 30 students.
• A teacher can teach a maximum of four classes per semester.
• An employee cannot take part in more than five projects.
• The salary of an employee cannot exceed the salary of the employee’s manager.
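
Rules like these usually cannot be expressed with keys alone. One possible approach, sketched here under the assumption of SQLite through Python’s sqlite3 module, is a trigger that counts existing rows before an insert. The Class table layout, the trigger name, and the helper add_class are hypothetical; the rule enforced is the four-classes-per-semester example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Class (
        ClassID   INTEGER PRIMARY KEY,
        TeacherID INTEGER NOT NULL,
        Semester  TEXT NOT NULL
    );
    -- Reject a fifth class for the same teacher in the same semester.
    CREATE TRIGGER max_four_classes
    BEFORE INSERT ON Class
    WHEN (SELECT COUNT(*) FROM Class
          WHERE TeacherID = NEW.TeacherID AND Semester = NEW.Semester) >= 4
    BEGIN
        SELECT RAISE(ABORT, 'a teacher can teach at most four classes per semester');
    END;
""")

def add_class(teacher_id, semester):
    """Return True if the class is stored, False if the trigger aborts the insert."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Class (TeacherID, Semester) VALUES (?, ?)",
                (teacher_id, semester),
            )
        return True
    except sqlite3.DatabaseError:
        return False

same_semester = [add_class(7, "2024-Fall") for _ in range(5)]
other_semester = add_class(7, "2025-Spring")
```

This sketch only guards inserts; an update that moves a class to another teacher or semester would need a matching trigger.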

Business Rules
Business rules are obtained from users when gathering requirements. The
requirements-gathering process is very important, and its results should be verified
by the user before the database design is built. If the business rules are incorrect,
the design will be incorrect, and ultimately the application built will not function as
expected by the users.

Some examples of business rules are:

• A teacher can teach many students.
• A class can have a maximum of 35 students.
• A course can be taught many times, but by only one instructor.
• Not all teachers teach classes.

Connectivity and Cardinality
By connectivity we mean how many instances of one entity are associated with how
many instances of another entity in a relationship. Cardinality is used to specify such
connectivity. The connectivity of a relationship describes the mapping of associated
entity instances in the relationship. The values of connectivity are "one" or "many".
The cardinality of a relationship is the actual number of related occurrences for each of
the two entities. The basic types of connectivity for relations are: one-to-one,
one-to-many, and many-to-many.

A one-to-one (1:1) relationship is when at most one instance of entity A is
associated with one instance of entity B. For example, take the relationship between
board members and offices, where each office is held by one member and no member
may hold more than one office.

A one-to-many (1:N) relationship is when for one instance of entity A, there are zero,
one, or many instances of entity B, but for one instance of entity B, there is only one
instance of entity A. An example of a 1:N relationship is:

• a department has many employees;
• each employee is assigned to one department.

A many-to-many (M:N) relationship, sometimes called non-specific, is when for one
instance of entity A, there are zero, one, or many instances of entity B, and for one
instance of entity B, there are zero, one, or many instances of entity A. An example is:
employees may be assigned to no more than three projects at a time; every project has
at least two employees assigned to it.
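
In a relational schema, an M:N relationship is commonly stored in a third table that holds one row per pairing. The sketch below assumes SQLite through Python’s sqlite3 module; the Assignment table and all sample data are hypothetical. It shows only the pairing itself, not the "no more than three" and "at least two" limits from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, EmpName TEXT);
    CREATE TABLE Project  (ProjID INTEGER PRIMARY KEY, ProjName TEXT);
    -- One row per (employee, project) pairing; the composite PK stops
    -- the same pairing from being recorded twice.
    CREATE TABLE Assignment (
        EmpID  INTEGER NOT NULL REFERENCES Employee(EmpID),
        ProjID INTEGER NOT NULL REFERENCES Project(ProjID),
        PRIMARY KEY (EmpID, ProjID)
    );
    INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Project  VALUES (10, 'Payroll'), (20, 'Website');
    INSERT INTO Assignment VALUES (1, 10), (1, 20), (2, 10);
""")

# One employee, many projects.
projects_for_ann = [row[0] for row in conn.execute(
    "SELECT ProjID FROM Assignment WHERE EmpID = 1 ORDER BY ProjID")]
# One project, many employees.
staff_on_payroll = [row[0] for row in conn.execute(
    "SELECT EmpID FROM Assignment WHERE ProjID = 10 ORDER BY EmpID")]
```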

Cardinality
Cardinality refers to the relationship between a row of one table and a row of
another table. The only two options for cardinality are one or many.

Example: Think of a credit card company that has two tables: a table for the person
who gets the card and a table for the card itself. A row from the card_holder table
would have a relationship with a row in the card table because the card_holder
holds the card.

Here is an example of a one to one relationship.

If a card_holder can have only one card, this is a one to one relationship. If
the card_holder can have multiple cards, it is a one to many relationship:

Here you can see the previous example updated with a one to many relationship.

In a one to one relationship we have a connection from one row of the first table to
one row of another table. In a one to many relationship we have a connection from
one row of the first table to one or multiple rows of the other table.
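
In table terms, the difference between the two cases can come down to a single UNIQUE constraint on the FK column. The following is a sketch assuming SQLite through Python’s sqlite3 module; the column names id and holder_id and both helper functions are hypothetical.

```python
import sqlite3

def build(one_to_one):
    """Create card_holder and card; UNIQUE on holder_id caps cards per holder at one."""
    unique = "UNIQUE" if one_to_one else ""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE card_holder (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE card (
            id        INTEGER PRIMARY KEY,
            holder_id INTEGER {unique} REFERENCES card_holder(id)
        );
        INSERT INTO card_holder VALUES (1, 'Dana');
        INSERT INTO card VALUES (1, 1);
    """)
    return conn

def second_card_allowed(conn):
    """Try to give holder 1 a second card and report whether it was accepted."""
    try:
        with conn:
            conn.execute("INSERT INTO card VALUES (2, 1)")
        return True
    except sqlite3.IntegrityError:
        return False

one_to_one_result = second_card_allowed(build(one_to_one=True))
one_to_many_result = second_card_allowed(build(one_to_one=False))
```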

We can illustrate that using crow’s foot notation:

One is illustrated with a vertical line. Many is illustrated with three lines. Think of
many as a crow’s foot!

Modality
While cardinality is the maximum number of connections between table rows (either
one or many), modality is the minimum number of row connections. Modality also has
only two options: the minimum is either 0 or 1.

Another way to think of this is not required or required. If we have a modality of at
least zero, there doesn’t have to be a connection at all (nullable). If we have a
modality of at least one, then we have to have that connection (not null).

This can be a bit confusing so let’s look at the credit card company example:

In the card table we have a column called holder_id. This has a connection to the
card_holder table because a credit card is owned by somebody who will use it.
Now, if the modality is 0 or more, then we can have a row without a holder_id
value. If the modality is 1 or more, then we must have a value in the holder_id
column.

Modality of 0 or more = A card is not required to be held. This means that the table
not only holds active cards, but it may hold cards that are not currently held by
anybody. This is called a nullable column because it accepts an empty field.

Modality of 1 or more = A card is required to be held. This means that a card with
no owner would not be able to be put in this table. This is said to be a not
null column because it does not accept nulls.
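
The two modalities map directly onto whether the holder_id column is declared NOT NULL. A minimal sketch, assuming SQLite through Python’s sqlite3 module; the function unheld_card_allowed is hypothetical.

```python
import sqlite3

def unheld_card_allowed(modality):
    """modality 0 makes holder_id nullable; modality 1 declares it NOT NULL."""
    null_rule = "NOT NULL" if modality == 1 else ""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(f"""
        CREATE TABLE card_holder (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE card (
            id        INTEGER PRIMARY KEY,
            holder_id INTEGER {null_rule} REFERENCES card_holder(id)
        );
    """)
    try:
        with conn:
            # A card that nobody currently holds.
            conn.execute("INSERT INTO card (id, holder_id) VALUES (1, NULL)")
        return True
    except sqlite3.IntegrityError:
        return False

optional = unheld_card_allowed(0)  # modality of 0 or more
required = unheld_card_allowed(1)  # modality of 1 or more
```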

We can also draw modality in our crow’s foot notation. If the modality is zero or
more, we put a little circle right beside the cardinality. If the modality is one or
more, we put a vertical line next to the cardinality:

Here is an illustration of modality. Notice that we can put both cardinality and
modality on the same line. Modality goes on the inside and cardinality goes on the
outside.

Cardinality and modality together give us four possible combinations: zero or one,
one and only one, zero or many, and one or many.

Here are all of our possibilities.

Let’s apply this to the credit card example:


Here are all of our options. The only thing to take into consideration is the fact
that we call the person a card holder even if he or she doesn’t currently have a credit
card. It may be best to either change the title “card holder” to “customer” or only
use “at least one” to ensure that a card holder is currently holding a card.

This can go both ways. Let’s look at an example of a person taking a class in a
college. Here are our rules: A class has to have at least 1 person but is not limited to
just one, and a person can take at least zero classes but is not limited to just one:

Here you can see how we can put both examples on a single line to complete the
relationship diagram.

Conclusion
• Cardinality is the max.
• Cardinality is one or many.
• One is illustrated with a vertical line.
• Many is illustrated with a crow’s foot.
• Modality is the least.
• Modality is 0 or 1.
• 0 is a circle.
• 1 is a vertical line.
• Modality goes on the inside of the line while cardinality goes on the outside.
• Modality illustrates whether a column is nullable or not null.

Relationship Types
The line that connects two tables, in an ERD, indicates the relationship
type between the tables: either identifying or non-identifying. An identifying
relationship will have a solid line (where the PK contains the FK). A non-
identifying relationship is indicated by a broken line and does not contain the FK in
the PK. See the section in Chapter 8 that discusses weak and strong relationships
for more explanation.
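
The same distinction can be read off a table definition: a relationship is identifying when the child’s FK columns are part of its PK. The sketch below assumes SQLite through Python’s sqlite3 module and reuses the Course/Class and Customer/Orders tables from earlier in the chapter; the helper is_identifying is hypothetical and relies on SQLite’s table_info and foreign_key_list pragmas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Course (CrsCode TEXT NOT NULL PRIMARY KEY, Description TEXT);
    -- Identifying: the FK CrsCode is part of the composite PK.
    CREATE TABLE Class (
        CrsCode   TEXT NOT NULL REFERENCES Course(CrsCode),
        Section   TEXT NOT NULL,
        ClassTime TEXT,
        PRIMARY KEY (CrsCode, Section)
    );
    CREATE TABLE Customer (CustID INTEGER PRIMARY KEY, CustName TEXT);
    -- Non-identifying: the FK CustID is outside the PK.
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        CustID  INTEGER REFERENCES Customer(CustID)
    );
""")

def is_identifying(child_table):
    """True if every FK column of child_table is also part of its primary key."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk_cols = {row[1] for row in conn.execute(f"PRAGMA table_info({child_table})")
               if row[5] > 0}
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fk_cols = {row[3] for row in conn.execute(f"PRAGMA foreign_key_list({child_table})")}
    return bool(fk_cols) and fk_cols <= pk_cols

class_is_identifying = is_identifying("Class")
orders_is_identifying = is_identifying("Orders")
```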

Figure 9.5. Identifying and non-identifying relationship, by A. Watt.

Optional relationships
In an optional relationship, the FK can be null or the parent table does not need to
have a corresponding child table occurrence. The symbol, shown in Figure 9.6,
illustrates one type with zero and three prongs (indicating many) which is
interpreted as zero OR many.

Figure 9.6.

For example, if you look at the Order table on the right-hand side of Figure 9.7,
you’ll notice that a customer doesn’t need to place an order to be a customer. In
other words, the many side is optional.

Figure 9.7. Example usage of a zero to many optional relationship symbols, by A. Watt.

The relationship symbol in Figure 9.7 can also be read as follows:

• Left side: The Order entity must be related to a minimum of one and a maximum of
one entity in the Customer table.
• Right side: A customer can place a minimum of zero orders and a maximum of many
orders.

Figure 9.8 shows another type of optional relationship symbol with a zero and one,
meaning zero OR one. The one side is optional.

Figure 9.8.

Figure 9.9 gives an example of how a zero to one symbol might be used.

Figure 9.9. Example usage of a zero to one optional relationship symbol, by A. Watt.

Mandatory relationships
In a mandatory relationship, one entity occurrence requires a corresponding entity
occurrence. The symbol for this relationship shows one and only one as shown in
Figure 9.10. The one side is mandatory.

Figure 9.10

See Figure 9.11 for an example of how the one and only one mandatory symbol is
used.

Figure 9.11. Example of a one and only one mandatory relationship symbol, by A. Watt.

Figure 9.12 illustrates what a one to many relationship symbol looks like where
the many side is mandatory.

Figure 9.12.

Refer to Figure 9.13 for an example of how the one to many symbols may be used.

Figure 9.13. Example of a one to many mandatory relationship symbols, by A. Watt.

So far we have seen that the innermost side of a relationship symbol (on the left-
side of the symbol in Figure 9.14) can have a 0 (zero) cardinality and a connectivity
of many (shown on the right-side of the symbol in Figure 9.14), or one (not
shown).

Figure 9.14

However, it cannot have a connectivity of 0 (zero), as displayed in Figure 9.15. The
connectivity can only be 1.

Figure 9.15.

The connectivity symbols show maximums. So, if you think about it logically, if
the connectivity symbol on the left side shows 0 (zero), then there would be no
connection between the tables.

The way to read a relationship symbol, such as the one in Figure 9.16, is as follows.

• The CustID in the Order table must also be found in the Customer table a minimum
of 0 and a maximum of 1 time.
• The 0 means that the CustID in the Order table may be null.
• The left-most 1 (right before the 0 representing connectivity) says that if there is a
CustID in the Order table, it can only be in the Customer table once.
• When you see the 0 symbol for cardinality, you can assume two things:
1. the FK in the Order table allows nulls, and
2. the FK is not part of the PK since PKs must not contain null values.

Figure 9.16. The relationship between a Customer table and an Order table, by A. Watt.

Key Terms
business rules: obtained from users when gathering requirements and are used to
determine cardinality

cardinality: expresses the minimum and maximum number of entity occurrences
associated with one occurrence of a related entity

connectivity: the relationship between two tables, e.g., one to one or one to many

constraints: the rules that force DBMSs to check that data satisfies the semantics

entity integrity: requires that every table have a primary key; neither the primary
key, nor any part of it, can contain null values

identifying relationship: where the primary key contains the foreign key;
indicated in an ERD by a solid line

integrity constraints: logical statements that state what data values are or are not
allowed and which format is suitable for an attribute

mandatory relationship: one entity occurrence requires a corresponding entity
occurrence

non-identifying relationship: does not contain the foreign key in the primary key;
indicated in an ERD by a dotted line

optional relationship: the FK can be null or the parent table does not need to have
a corresponding child table occurrence

orphan record: a record whose foreign key value is not found in the corresponding
entity – the entity where the primary key is located

referential integrity: requires that a foreign key must have a matching primary key
or it must be null

relational database management system (RDBMS): a popular database system
based on the relational model introduced by E. F. Codd of IBM’s San Jose
Research Laboratory

relationship type: the type of relationship between two tables in an ERD (either
identifying or non-identifying); this relationship is indicated by a line drawn
between the two tables.

(Assignment #2 )
Exercises
Read the following description and then answer questions 1-5 at the end.

The swim club database in Figure 9.17 has been designed to hold information about
students who are enrolled in swim classes. The following information is stored:
students, enrollment, swim classes, pools where classes are held, instructors for the
classes, and various levels of swim classes. Use Figure 9.17 to answer questions 1
to 5.

Figure 9.17. ERD for questions 1-5. (Diagram by A. Watt.)


The primary keys are identified below. The following data types are defined in the
SQL Server.

tblLevels
Level – Identity PK
ClassName – text 20 – nulls are not allowed

tblPool
Pool – Identity PK
PoolName – text 20 – nulls are not allowed
Location – text 30

tblStaff
StaffID – Identity PK
FirstName – text 20
MiddleInitial – text 3
LastName – text 30
Suffix – text 3
Salaried – Bit
PayAmount – money

tblClasses
LessonIndex – Identity PK
Level – Integer FK
SectionID – Integer
Semester – TinyInt
Days – text 20
Time – datetime (formatted for time)
Pool – Integer FK
Instructor – Integer FK
Limit – TinyInt
Enrolled – TinyInt
Price – money

tblEnrollment
LessonIndex – Integer FK
SID – Integer FK
(LessonIndex and SID) – Primary Key
Status – text 30
Charged – bit
AmountPaid – money
DateEnrolled – datetime

tblStudents
SID – Identity PK
FirstName – text 20
MiddleInitial – text 3
LastName – text 30
Suffix – text 3
Birthday – datetime
LocalStreet – text 30
LocalCity – text 20
LocalPostalCode – text 6
LocalPhone – text 10

Implement this schema in MS Access (you will need to pick comparable data
types). Submit a screenshot of your ERD in the database.

1. Explain the relationship rules for each relationship (e.g., tblEnrollment and
tblStudents: A student can enroll in many classes).
2. Identify cardinality for each relationship, assuming the following rules:
o A pool may or may not ever have a class.
o The levels table must always be associated with at least one class.
o The staff table may not have ever taught a class.
o All students must be enrolled in at least one class.
o The class must have students enrolled in it.
o The class must have a valid pool.
o The class may not have an instructor assigned.
o The class must always be associated with an existing level.
3. Which tables are weak and which tables are strong?
4. Which of the tables are non-identifying and which are identifying?
