Unit 2
Data Modelling
Contents
• Data Modeling Using the E-R Model: Entity Types, Entity Sets,
Attributes, and Keys, Relationships, Relationship Types, Roles, and
Structural Constraints, Weak Entity Types, Refining the ER Design for
the COMPANY Database, ER Diagrams, Naming Conventions, and
Design Issues
• The Relational Data Model: Relational Model Concepts, Relational
Constraints and Relational Database Schemas, Update Operations and
Dealing with Constraint Violations, Basic Relational Algebra
Operations, Additional Relational Operations
Entity and entity types
Entity
An entity is a real-world thing that can be distinctly identified, such as a
person, a place or a concept. It is an object that is distinguishable from
others. If we cannot distinguish it from others, it is an object but not an
entity. An entity can be of two types:
• Tangible Entity: Tangible Entities are those entities which exist in the real
world physically. Example: Person, car, etc.
• Intangible Entity: Intangible Entities are those entities which exist only
logically and have no physical existence. Example: Bank Account, etc.
• Example: If we have a Student table (Roll_no, Student_name, Age,
Mobile_no), then each student in that table is an entity and can be uniquely
identified by their roll number, i.e., Roll_no.
Entity Type
• The entity type is a collection of entities having similar attributes.
• In the Student table example above, each row is an entity, and all
rows share the same attributes, i.e., each row has its own value for
the attributes Roll_no, Age, Student_name and Mobile_no.
• So, we can define the above STUDENT table as an entity type because
it is a collection of entities having the same attributes.
• The table below shows how the data of different entities (different
students) are stored.
Types of Entity types
• Strong Entity Type: Strong entity types are those which have a key
attribute. The primary key identifies each entity uniquely. In the above
example, Roll_no identifies each row of the table uniquely and hence we can
say that STUDENT is a strong entity type. It is represented by a rectangle.
• Weak Entity Type: Weak entity types do not have a key attribute of their own.
A weak entity can't be identified on its own; it depends on some strong entity
for its distinct identity. For example, there can be children only if the parent
exists; children have no independent existence. Likewise, there can be a room
only if the building exists; a room has no independent existence. A weak entity
type is represented by a double-outlined rectangle. The relationship between a
weak entity type and its strong entity type is called an identifying relationship
and is shown with a double-outlined diamond instead of a single-outlined diamond.
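The building and room example above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the tables building and room and their columns are hypothetical and not taken from the slides. The room table borrows building_id from its owner, so a room is identified only together with its building.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE building (
    building_id INTEGER PRIMARY KEY
);
-- Weak entity: identified by its owner's key plus a partial key (room_no).
CREATE TABLE room (
    building_id INTEGER NOT NULL
        REFERENCES building(building_id) ON DELETE CASCADE,
    room_no     INTEGER NOT NULL,
    PRIMARY KEY (building_id, room_no)
);
""")

conn.execute("INSERT INTO building VALUES (1)")
conn.execute("INSERT INTO room VALUES (1, 101)")

# A room cannot exist without its building.
try:
    conn.execute("INSERT INTO room VALUES (99, 101)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Removing the building removes its rooms as well.
conn.execute("DELETE FROM building WHERE building_id = 1")
remaining_rooms = conn.execute("SELECT COUNT(*) FROM room").fetchone()[0]
```

The ON DELETE CASCADE clause is one possible design choice for expressing existence dependency; other schemas might instead forbid deleting a building that still has rooms.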
Entity Set
• An entity set is a collection of entities of the same entity type. In the
above example of the STUDENT entity type, a collection of entities from
the Student entity type forms an entity set. The entity type covers
every entity of that kind, so any entity set is drawn from it; in this
sense the entity type is a superset of the entity set. Let's try to
understand this with the help of an example.
• Example 1: In the example below, two entities E1 (2, Angel, 19,
8709054568) and E2 (4, Analisa, 21, 9847852156) form an entity set.
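A minimal Python sketch of the distinction, using the two entities E1 and E2 from the example. The class and variable names are illustrative choices, and a dataclass is only one way to model an entity type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Entity type: the attributes every student entity has."""
    roll_no: int
    student_name: str
    age: int
    mobile_no: str


# Two entities (E1 and E2 from the example above).
e1 = Student(2, "Angel", 19, "8709054568")
e2 = Student(4, "Analisa", 21, "9847852156")

# Entity set: a collection of entities of the Student entity type.
student_set = {e1, e2}

# Roll_no distinguishes each entity within the set.
by_roll_no = {s.roll_no: s for s in student_set}
print(by_roll_no[4].student_name)  # Analisa
```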
Attributes
• An attribute describes the facts, details or characteristics of an entity.
• In a database, an attribute corresponds to a field, i.e., a column of
a table.
Database Keys
• A DBMS key is an attribute or set of attributes which helps you to
identify a row (tuple) in a relation (table). Keys also allow you to find
the relation between two tables. A key uniquely identifies a row in
a table by a combination of one or more columns in that table.
• In the above-given example, employee ID is a primary key because it
uniquely identifies an employee record. In this table, no other
employee can have the same employee ID.
Why do we need keys?
• Keys help you to identify any row of data in a table. In a real-world
application, a table could contain thousands of records, and some
records could otherwise be duplicated. Keys ensure that you can
uniquely identify a table record despite these challenges.
• Keys allow you to establish and identify the relationship between
tables.
• Keys help you to enforce identity and integrity in the relationship.
Types of keys
• Super Key
• Primary Key
• Candidate Key
• Alternate Key
• Foreign Key
• Compound Key
• Composite Key
• Surrogate Key
Super key
• A superkey is a set of one or more attributes which uniquely identifies
rows in a table. A superkey may have additional attributes that are not
needed for unique identification.
• In the above-given example, EmpSSN and EmpNum name are
superkeys.
Primary key
• A PRIMARY KEY is a column or group of columns in a table that uniquely
identifies every row in that table. Primary key values can't be duplicated,
meaning the same value can't appear more than once in the table. A table
cannot have more than one primary key.
Rules for defining Primary Keys
• Two rows can't have the same primary key value.
• Every row must have a primary key value.
• The primary key field cannot be null.
• The value in a primary key column should not be modified or updated while
any foreign key refers to that primary key.
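The first three rules can be observed with Python's built-in sqlite3 module. This is a sketch under assumptions: the employee table, its columns and the sample names are hypothetical. NOT NULL is written out explicitly because SQLite, for historical reasons, accepts NULL in non-integer primary key columns unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO employee VALUES ('E1', 'Asha')")


def try_insert(emp_id, name):
    """Return 'ok' if the row is accepted, otherwise the reason it was rejected."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, name))
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


duplicate = try_insert("E1", "Ravi")   # same primary key value as an existing row
missing = try_insert(None, "Ravi")     # null primary key
fresh = try_insert("E2", "Ravi")       # new, non-null value
```

Here duplicate and missing both come back as rejections, while fresh is accepted.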
Alternate key
• An ALTERNATE KEY is a column or group of columns in a table that
uniquely identifies every row in that table. A table can have multiple
choices for a primary key, but only one can be set as the primary key.
All the candidate keys which are not chosen as the primary key are
called alternate keys.
• In this table, StudID, Roll No and Email are qualified to become a primary
key. But since StudID is the primary key, Roll No and Email become the
alternate keys.
Candidate key
• A CANDIDATE KEY is a set of attributes that uniquely identifies tuples in a
table. A candidate key is a super key with no redundant attributes, i.e., a
minimal super key. The primary key should be selected from the candidate
keys. Every table must have at least one candidate key. A table can have
multiple candidate keys but only a single primary key.
Properties of Candidate key
• It must contain unique values.
• It may have multiple attributes.
• It must not contain null values.
• It should contain the minimum fields needed to ensure uniqueness.
• It uniquely identifies each record in a table.
• Example: In the given table Stud ID, Roll No, and email are candidate
keys which help us to uniquely identify the student record in the
table.
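The link between super keys and candidate keys can be sketched in Python as a search for minimal attribute sets whose values are unique. The sample rows are invented for illustration, and the check only looks at the rows given; in a real design a key is a property of the schema, not of one snapshot of data.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(rows):
    """Minimal superkeys: superkeys that contain no smaller key."""
    attributes = list(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical student records (not from the slides).
students = [
    {"StudID": 1, "RollNo": 11, "Name": "Tom", "Email": "tom@example.com"},
    {"StudID": 2, "RollNo": 12, "Name": "Nick", "Email": "nick@example.com"},
    {"StudID": 3, "RollNo": 13, "Name": "Tom", "Email": "tom2@example.com"},
]

print(candidate_keys(students))
# [('StudID',), ('RollNo',), ('Email',)]
```

For instance, (StudID, Name) is a super key for these rows but not a candidate key, because StudID alone already identifies every row.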
Foreign key
• A FOREIGN KEY is a column (or group of columns) that creates a relationship
between two tables. The purpose of foreign keys is to maintain data integrity
and allow navigation between two different instances of an entity. It acts as
a cross-reference between two tables, as it references the primary key of
another table.
• In this example, we have two tables, teacher and department, in a school.
However, there is no way to see which teacher works in which department.
• By adding Deptcode as a foreign key in the teacher table, we
can create a relationship between the two tables.
• The rule that every foreign key value must match an existing primary key
value is known as Referential Integrity.
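A small sqlite3 sketch of the teacher and department example. The column names other than DeptCode and all sample values are assumptions made for illustration. It shows that a teacher row cannot point to a missing department, and that a referenced primary key value cannot be changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE department (
    DeptCode TEXT PRIMARY KEY NOT NULL,
    DeptName TEXT
);
CREATE TABLE teacher (
    TeacherID   TEXT PRIMARY KEY NOT NULL,
    TeacherName TEXT,
    DeptCode    TEXT REFERENCES department(DeptCode)  -- foreign key
);
INSERT INTO department VALUES ('001', 'Science');
INSERT INTO teacher VALUES ('B002', 'David', '001');
""")


def violates(sql):
    """True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# No department '999' exists, so this teacher row is rejected.
bad_reference = violates("INSERT INTO teacher VALUES ('B017', 'Sara', '999')")

# Department '001' is still referenced, so its key cannot be changed.
bad_update = violates(
    "UPDATE department SET DeptCode = '002' WHERE DeptCode = '001'"
)

# The foreign key lets us navigate from a teacher to the department.
pairs = conn.execute(
    "SELECT TeacherName, DeptName FROM teacher JOIN department USING (DeptCode)"
).fetchall()
```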
Compound key
• A COMPOUND KEY has two or more attributes that together allow you to
uniquely recognize a specific record. Each column may not be unique
by itself within the table. However, when it is combined with the
other column or columns, the combination becomes unique. The
purpose of a compound key is to uniquely identify each record in the
table.
• In this example, neither OrderNo nor ProductID can be a primary key
on its own, as it does not uniquely identify a record. However, a
compound key of OrderNo and ProductID could be used, as together
they uniquely identify each record.
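A sketch of the OrderNo and ProductID example with sqlite3. The table name order_item, the Quantity column and the sample rows are hypothetical. Each column repeats on its own, but the pair must be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_item (
        OrderNo   INTEGER NOT NULL,
        ProductID TEXT    NOT NULL,
        Quantity  INTEGER,
        PRIMARY KEY (OrderNo, ProductID)  -- key made of two columns
    )
""")

# OrderNo 1 and ProductID 'P1' each appear twice, but no pair repeats.
rows = [(1, "P1", 2), (1, "P2", 1), (2, "P1", 5)]
conn.executemany("INSERT INTO order_item VALUES (?, ?, ?)", rows)

# Repeating an existing (OrderNo, ProductID) pair is rejected.
try:
    conn.execute("INSERT INTO order_item VALUES (1, 'P1', 9)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM order_item").fetchone()[0]
```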
Composite key
• A COMPOSITE KEY is a combination of two or more columns that
uniquely identifies rows in a table. The combination of columns
guarantees uniqueness, though individually uniqueness is not
guaranteed. Hence, they are combined to uniquely identify records in
a table.
• The difference between a compound key and a composite key is that
any part of the compound key can be a foreign key, whereas a part of
the composite key may or may not be a foreign key.
Surrogate key
• An artificial key which aims to uniquely identify each record is called a
surrogate key. This kind of key is created when you don't have any natural
primary key, and its values are generated so that they are unique. It does
not lend any meaning to the data in the table. A surrogate key is usually an
integer.
• The example below shows the shift timings of different employees. In
this example, a surrogate key is needed to uniquely identify each
employee.
Surrogate keys are allowed when
• No attribute qualifies as a natural primary key.
• The natural primary key is too big or complicated.
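A sketch of a surrogate key for shift timings using sqlite3. The table layout and sample values are assumptions, since the original shift table is not reproduced here. In SQLite, a column declared INTEGER PRIMARY KEY is filled in automatically when no value is supplied, which makes it a convenient surrogate key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE shift (
        shift_id   INTEGER PRIMARY KEY,  -- surrogate key, generated by the database
        employee   TEXT NOT NULL,
        start_time TEXT,
        end_time   TEXT
    )
""")


def add_shift(employee, start_time, end_time):
    """Insert a shift and return the surrogate key assigned to it."""
    cur = conn.execute(
        "INSERT INTO shift (employee, start_time, end_time) VALUES (?, ?, ?)",
        (employee, start_time, end_time),
    )
    return cur.lastrowid


# Two rows with identical data still get distinct identifiers.
first_id = add_shift("Anne", "08:00", "16:00")
second_id = add_shift("Anne", "08:00", "16:00")
```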
Difference between Primary key and Foreign Key
Relationship types
One to One Relationship
• This type of relationship allows only one record on each side of the
relationship. The primary key relates to only one record (or none) in
another table. An employee can work in at most one department, and
a department can have at most one employee.
One to Many Relationship
• A one-to-many relationship allows a single record in one table to be
related to multiple records in another table. An employee can work in
many departments (>=0), but a department can have at most one
employee.
Many to One Relationship
• An employee can work in at most one department (<=1), and a
department can have several employees.
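The three cases above can be told apart by counting partners on each side. The sketch below classifies a list of (employee, department) pairs; the function name and sample identifiers are made up for illustration. It inspects one set of instance data only, whereas a cardinality constraint is a statement about what the schema permits.

```python
from collections import defaultdict


def classify(pairs):
    """Classify (left, right) pairs, e.g. (employee, department)."""
    rights_of = defaultdict(set)  # partners of each left-hand entity
    lefts_of = defaultdict(set)   # partners of each right-hand entity
    for left, right in pairs:
        rights_of[left].add(right)
        lefts_of[right].add(left)

    left_has_many = any(len(v) > 1 for v in rights_of.values())
    right_has_many = any(len(v) > 1 for v in lefts_of.values())

    if left_has_many and right_has_many:
        return "many-to-many"
    if left_has_many:
        return "one-to-many"   # one employee, many departments
    if right_has_many:
        return "many-to-one"   # many employees, one department
    return "one-to-one"


print(classify([("e1", "d1"), ("e2", "d2")]))  # one-to-one
print(classify([("e1", "d1"), ("e1", "d2")]))  # one-to-many
print(classify([("e1", "d1"), ("e2", "d1")]))  # many-to-one
```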
Example 2
σ topic = "Database" and author = "guru99" (Tutorials)
• Output - Selects tuples from Tutorials where topic is 'Database' and author is 'guru99'.
Example 3
σ sales > 50000 (Customers)
• Output - Selects tuples from Customers where sales is greater than 50000
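The select operation can be mimicked in Python by filtering a list of rows with a predicate. This is a sketch under assumptions: relations are modelled as lists of dictionaries, and the sample rows are invented.

```python
def select(relation, predicate):
    """σ: keep only the tuples for which the predicate holds."""
    return [row for row in relation if predicate(row)]


# Hypothetical contents for the Tutorials and Customers relations.
tutorials = [
    {"topic": "Database", "author": "guru99"},
    {"topic": "Database", "author": "someone_else"},
    {"topic": "Networking", "author": "guru99"},
]
customers = [
    {"name": "C1", "sales": 70000},
    {"name": "C2", "sales": 30000},
]

# σ topic = "Database" and author = "guru99" (Tutorials)
db_by_guru99 = select(
    tutorials, lambda t: t["topic"] == "Database" and t["author"] == "guru99"
)

# σ sales > 50000 (Customers)
big_customers = select(customers, lambda c: c["sales"] > 50000)
```

Select never changes the columns of a relation; it only reduces the number of rows.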
Basic Relational Algebra Operations
Project Operation (∏)
• It projects the listed column(s) of a relation and drops all others.
Notation − ∏ A1, A2, ..., An (r)
• Where A1, A2, ..., An are attribute names of relation r.
• Duplicate rows are automatically eliminated, as a relation is a set.
• For example −
∏ subject, author (Books)
• Output − Selects and projects the columns named subject and author
from the relation Books.
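A Python sketch of projection, including the duplicate elimination mentioned above. The Books rows are hypothetical, and modelling a relation as a list of dictionaries is just one convenient choice.

```python
def project(relation, attributes):
    """∏: keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        values = tuple(row[a] for a in attributes)
        if values not in seen:  # a relation is a set, so no duplicates
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Hypothetical Books relation.
books = [
    {"title": "B1", "subject": "Database", "author": "Ann"},
    {"title": "B2", "subject": "Database", "author": "Ann"},
    {"title": "B3", "subject": "Networks", "author": "Bob"},
]

# ∏ subject, author (Books)
subject_author = project(books, ["subject", "author"])
print(subject_author)
# [{'subject': 'Database', 'author': 'Ann'}, {'subject': 'Networks', 'author': 'Bob'}]
```

Three input rows give two output rows because the first two agree on both projected attributes.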
Projection (π) example table columns: CustomerID, CustomerName, Status
• A ∪ B gives:
Table A ∪ B
Column 1 | Column 2
1 | 1
1 | 2
1 | 3
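A short Python sketch of union using sets of tuples. The input tables A and B below are assumed, since only the result of A ∪ B appears in the text; they are chosen so that their union matches the rows listed above.

```python
# Each relation is a set of (column_1, column_2) tuples.
# A and B are assumed inputs, picked to reproduce the result shown above.
table_a = {(1, 1), (1, 2)}
table_b = {(1, 1), (1, 3)}

# A ∪ B: every tuple that is in A or in B; the shared tuple appears once.
union = table_a | table_b
print(sorted(union))  # [(1, 1), (1, 2), (1, 3)]
```

Both relations must have the same number and kind of columns for a union to make sense.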
Set Difference (−)
• The result of set difference operation is tuples, which are present in
one relation but are not in the second relation.
Notation − r − s
• Finds all the tuples that are present in r but not in s.
• For example −
∏ author (Books) − ∏ author (Articles)
• Output − Provides the name of authors who have written books but
not articles.
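The authors example can be sketched in Python by projecting both relations onto author and subtracting the resulting sets. The rows of Books and Articles are invented for illustration.

```python
def project(relation, attributes):
    """∏ as a set of tuples, so duplicates disappear automatically."""
    return {tuple(row[a] for a in attributes) for row in relation}


# Hypothetical Books and Articles relations.
books = [
    {"title": "DB Basics", "author": "Ann"},
    {"title": "SQL Notes", "author": "Bob"},
]
articles = [
    {"title": "Indexing", "author": "Bob"},
    {"title": "Joins", "author": "Cy"},
]

# ∏ author (Books) − ∏ author (Articles)
only_books = project(books, ["author"]) - project(articles, ["author"])
print(only_books)  # {('Ann',)}
```

Set difference is not symmetric: swapping the operands would instead give the authors who wrote articles but no books.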
Cartesian Product (×)
• Combines information of two different relations into one.
Notation − r × s
• Where r and s are relations and their output will be defined as −
• r × s = { q t | q ∈ r and t ∈ s }
• For example −
σ author = 'tutorialspoint' (Books × Articles)
• Output − Yields a relation, which shows all the books and articles
written by tutorialspoint.
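A Python sketch of the Cartesian product followed by a selection. The sample rows are hypothetical. Because both relations have an author attribute, the sketch prefixes each attribute with its relation name and assumes the condition is meant to hold for both the book and the article.

```python
from itertools import product


def cartesian_product(r, s, r_name, s_name):
    """r × s: pair every tuple q of r with every tuple t of s."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in q.items()},
            **{f"{s_name}.{k}": v for k, v in t.items()},
        }
        for q, t in product(r, s)
    ]


# Hypothetical Books and Articles relations.
books = [
    {"title": "B1", "author": "tutorialspoint"},
    {"title": "B2", "author": "other"},
]
articles = [
    {"title": "A1", "author": "tutorialspoint"},
    {"title": "A2", "author": "other"},
]

pairs = cartesian_product(books, articles, "Books", "Articles")  # 2 × 2 = 4 rows

# σ author = 'tutorialspoint' (Books × Articles), applied to both author columns
by_tutorialspoint = [
    row for row in pairs
    if row["Books.author"] == "tutorialspoint"
    and row["Articles.author"] == "tutorialspoint"
]
```

The product on its own has as many rows as the two row counts multiplied, which is why it is almost always followed by a selection.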
Rename Operation (ρ)
• The results of relational algebra expressions are also relations, but
without any name. The rename operation allows us to name the output
relation. The rename operation is denoted by the small Greek letter rho (ρ).
Notation − ρ x (E)
• Where the result of expression E is saved under the name x.
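A minimal Python sketch of rename. The Relation class is a hypothetical helper introduced only for this illustration; it pairs an optional name with a set of rows so that an unnamed result can be given a name.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Relation:
    name: Optional[str]  # None for the unnamed result of an expression
    rows: frozenset


def rename(new_name, relation):
    """ρ new_name (E): same tuples, now stored under a name."""
    return replace(relation, name=new_name)


# E: some expression result, which has no name yet (sample data is invented).
expression_result = Relation(None, frozenset({("Ann",), ("Bob",)}))

# ρ x (E)
x = rename("x", expression_result)
print(x.name, sorted(x.rows))  # x [('Ann',), ('Bob',)]
```

Rename changes only the name; the tuples of the relation stay exactly the same.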
Summary
Operation (Symbol) | Purpose
Select (σ) | The SELECT operation is used for selecting a subset of the tuples according to a given selection condition.
Projection (π) | The projection eliminates all attributes of the input relation except those mentioned in the projection list.
Union Operation (∪) | UNION is symbolized by the ∪ symbol. It includes all tuples that are in table A or in table B.
Set Difference (−) | It is denoted by the − symbol. The result of A − B is a relation which includes all tuples that are in A but not in B.
Intersection (∩) | Intersection defines a relation consisting of the set of all tuples that are in both A and B.
Cartesian Product (×) | The Cartesian product is helpful for merging columns from two relations.
Inner Join | An inner join includes only those tuples that satisfy the matching criteria.
Theta Join (θ) | The general case of the JOIN operation is called a theta join. It is denoted by the symbol θ.
Equi Join | When a theta join uses only an equality condition, it becomes an equi join.
Natural Join (⋈) | A natural join can only be performed if there is a common attribute (column) between the relations.
Outer Join | An outer join includes, along with the tuples that satisfy the matching criteria, some or all of the tuples that do not.
Left Outer Join (⟕) | The left outer join keeps all tuples of the left relation.
Right Outer Join (⟖) | The right outer join keeps all tuples of the right relation.
Full Outer Join (⟗) | In a full outer join, all tuples from both relations are included in the result, irrespective of the matching condition.