
ER Diagram Representation

The document summarizes how ER diagrams represent entities, attributes, relationships, and cardinalities in an ER model. Entities are represented by rectangles, attributes by ellipses, relationships by diamonds, and cardinalities use labels like 1:1, 1:N, N:1, N:N to indicate the number of instances of each entity that can be associated with the relationship. Generalization and specialization are also discussed to show how entities can be grouped at different levels of abstraction.

Uploaded by

Owen Luz
Copyright
© All Rights Reserved

ER Diagram Representation

Let us now learn how the ER Model is represented by means of an ER diagram. Any object, for example,
entities, attributes of an entity, relationship sets, and attributes of relationship sets, can be represented
with the help of an ER diagram.
Entity
Entities are represented by means of rectangles. Rectangles are named with the entity set they represent.

Attributes
Attributes are the properties of entities. Attributes are represented by means of ellipses. Every ellipse
represents one attribute and is directly connected to its entity (rectangle).

If an attribute is composite, it is further divided in a tree-like structure. Every component is then
connected to its parent attribute. That is, a composite attribute is represented by an ellipse to which
the ellipses of its component attributes are connected.

Multivalued attributes are depicted by a double ellipse.

Derived attributes are depicted by a dashed ellipse.

Relationship
Relationships are represented by a diamond-shaped box. The name of the relationship is written inside the
diamond-box. All the entities (rectangles) participating in a relationship are connected to it by a line.
Binary Relationship and Cardinality
A relationship in which two entities participate is called a binary relationship. Cardinality is the number
of instances of an entity that can be associated with the relationship.
• One-to-one − When only one instance of each entity is associated with the relationship, it is marked
as '1:1'. The following image reflects that only one instance of each entity should be associated
with the relationship. It depicts a one-to-one relationship.

• One-to-many − When one instance of the entity on the left can be associated with more than one
instance of the entity on the right, it is marked as '1:N'. The following image reflects that only one
instance of the entity on the left and more than one instance of the entity on the right can be
associated with the relationship. It depicts a one-to-many relationship.

• Many-to-one − When more than one instance of the entity on the left can be associated with only one
instance of the entity on the right, it is marked as 'N:1'. The following image reflects that more than
one instance of the entity on the left and only one instance of the entity on the right can be
associated with the relationship. It depicts a many-to-one relationship.

• Many-to-many − When more than one instance of each entity can be associated with the relationship,
it is marked as 'N:N'. The following image reflects that more than one instance of the entity on the left
and more than one instance of the entity on the right can be associated with the relationship. It
depicts a many-to-many relationship.
Participation Constraints
• Total participation − Every entity in the entity set is involved in the relationship. Total participation
is represented by double lines.
• Partial participation − Not all entities in the entity set are involved in the relationship. Partial
participation is represented by single lines.

Generalization Aggregation
The ER Model has the power of expressing database entities in a conceptual hierarchical manner. As the
hierarchy goes up, it generalizes the view of entities, and as we go deep in the hierarchy, it gives us the
detail of every entity included.
Going up in this structure is called generalization, where entities are clubbed together to represent a more
generalized view. For example, a particular student named Mira can be generalized along with all the
students. The entity shall be a student, and further, the student is a person. The reverse is
called specialization where a person is a student, and that student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the generalized entity contains the
properties common to all the entities it generalizes, is called generalization. In generalization, a number
of entities are brought together into one generalized entity based on their similar characteristics. For
example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into sub-
groups based on their characteristics. Take a group ‘Person’ for example. A person has name, date of
birth, gender, etc. These properties are common in all persons, human beings. But in a company, persons
can be identified as employee, employer, customer, or vendor, based on what role they play in the
company.

Similarly, in a school database, persons can be specialized as teacher, student, or staff, based on what
role they play in the school as entities.
Inheritance
We use all the above features of the ER Model in order to create classes of objects in object-oriented
programming. The details of entities are generally hidden from the user; this process is known
as abstraction.
Inheritance is an important feature of Generalization and Specialization. It allows lower-level entities to
inherit the attributes of higher-level entities.
For example, the attributes of a Person class such as name, age, and gender can be inherited by lower-
level entities such as Student or Teacher.

Codd's 12 Rules
Dr Edgar F. Codd, after his extensive research on the Relational Model of database systems, came up
with twelve rules of his own, which according to him, a database must obey in order to be regarded as a
true relational database.
These rules rest on a foundation rule, often called Rule 0: a system that claims to be a relational database
management system must manage its stored data using only its relational capabilities. This rule acts as a
base for all the other rules.
Rule 1: Information Rule
The data stored in a database, whether user data or metadata, must be a value of some table cell.
Everything in a database must be stored in a table format.
Rule 2: Guaranteed Access Rule
Every single data element (value) is guaranteed to be accessible logically with a combination of table-
name, primary-key (row value), and attribute-name (column value). No other means, such as pointers, can
be used to access data.
Rule 3: Systematic Treatment of NULL Values
The NULL values in a database must be given a systematic and uniform treatment. This is a very
important rule because a NULL can be interpreted as one of the following − data is missing, data is not
known, or data is not applicable.
Rule 4: Active Online Catalog
The structure description of the entire database must be stored in an online catalog, known as data
dictionary, which can be accessed by authorized users. Users can use the same query language to
access the catalog which they use to access the database itself.
Rule 5: Comprehensive Data Sub-Language Rule
A database can only be accessed using a language having linear syntax that supports data definition, data
manipulation, and transaction management operations. This language can be used directly or by means of
some application. If the database allows access to data without any help of this language, then it is
considered as a violation.
Rule 6: View Updating Rule
All the views of a database, which can theoretically be updated, must also be updatable by the system.
Rule 7: High-Level Insert, Update, and Delete Rule
A database must support high-level insertion, update, and deletion. This must not be limited to a single
row, that is, it must also support union, intersection and minus operations to yield sets of data records.
Rule 8: Physical Data Independence
The data stored in a database must be independent of the applications that access the database. Any
change in the physical structure of a database must not have any impact on how the data is being
accessed by external applications.
Rule 9: Logical Data Independence
The logical data in a database must be independent of its user's view (application). Any change in logical
data must not affect the applications using it. For example, if two tables are merged or one is split into two
different tables, there should be no impact or change on the user application. This is one of the most
difficult rules to apply.
Rule 10: Integrity Independence
A database must be independent of the application that uses it. All its integrity constraints can be
independently modified without the need of any change in the application. This rule makes a database
independent of the front-end application and its interface.
Rule 11: Distribution Independence
The end-user must not be able to see that the data is distributed over various locations. Users should
always get the impression that the data is located at one site only. This rule has been regarded as the
foundation of distributed database systems.
Rule 12: Non-Subversion Rule
If a system has an interface that provides access to low-level records, then the interface must not be able
to subvert the system and bypass security and integrity constraints.

Relational Data Model


Relational data model is the primary data model, which is used widely around the world for data storage
and processing. This model is simple and it has all the properties and capabilities required to process data
with storage efficiency.
Concepts
Tables − In the relational data model, relations are saved in the format of tables. This format stores the
relation among entities. A table has rows and columns, where rows represent records and columns
represent the attributes.
Tuple − A single row of a table, which contains a single record for that relation is called a tuple.
Relation instance − A finite set of tuples in the relational database system represents relation instance.
Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table name), attributes, and their
names.
Relation key − Each row has one or more attributes, known as relation key, which can identify the row in
the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope, known as attribute domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation. These conditions are
called Relational Integrity Constraints. There are three main integrity constraints −

• Key constraints
• Domain constraints
• Referential integrity constraints
Key Constraints
There must be at least one minimal subset of attributes in the relation that can identify a tuple uniquely.
This minimal subset of attributes is called a key for that relation. If there is more than one such minimal
subset, they are called candidate keys.
Key constraints enforce that −
• in a relation with a key attribute, no two tuples can have identical values for the key attributes.
• a key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
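The two key constraints above can be tried out with a minimal sketch using Python's standard sqlite3 module. The table student and its columns are hypothetical names chosen for illustration. The key column is declared NOT NULL explicitly, because SQLite does not reject NULL in a non-integer PRIMARY KEY column unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relation: stu_id is the key attribute.
conn.execute("CREATE TABLE student (stu_id TEXT NOT NULL PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES ('S1', 'Mira')")


def try_insert(row):
    """Return True if the tuple is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(("S1", "Alex"))  # same key value as an existing tuple
null_ok = try_insert((None, "Maya"))       # NULL in the key attribute
fresh_ok = try_insert(("S2", "Maya"))      # new, non-NULL key value
print(duplicate_ok, null_ok, fresh_ok)
```

Only the third insert should be accepted; the first two violate the uniqueness and NOT NULL parts of the key constraint respectively.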
Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can only be a non-negative integer.
The same constraints are applied to the attributes of a relation. Every attribute is bound to
have a specific range of values. For example, age cannot be less than zero and telephone numbers
cannot contain a character outside 0-9.
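One way to express such domains is with CHECK constraints. The sketch below uses Python's sqlite3 module; the table person, its columns, and the sample values are assumptions made for illustration, and the GLOB pattern is SQLite-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical relation: age must not be negative, phone may hold digits only.
conn.execute("""
    CREATE TABLE person (
        name  TEXT,
        age   INTEGER CHECK (age >= 0),
        phone TEXT    CHECK (phone NOT GLOB '*[^0-9]*')
    )
""")


def accepts(row):
    """Return True if the tuple satisfies every domain constraint."""
    try:
        conn.execute("INSERT INTO person VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid = accepts(("Mira", 21, "5550100"))
negative_age = accepts(("Alex", -3, "5550101"))
bad_phone = accepts(("Maya", 30, "555-0102"))
print(valid, negative_age, bad_phone)
```

The second and third tuples fall outside the declared domains, so the database should refuse them.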
Referential integrity Constraints
Referential integrity constraints work on the concept of foreign keys. A foreign key is an attribute (or set of
attributes) of a relation that refers to a key attribute of another relation or of the same relation.
The referential integrity constraint states that if a relation refers to a key attribute of a different or the
same relation, then that key value must exist.
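A small sketch of this rule, again with Python's sqlite3 module. The tables department and course are hypothetical. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued, so the sketch turns that on first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE department (dept TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE course (
        cid  TEXT PRIMARY KEY,
        dept TEXT REFERENCES department(dept)  -- foreign key
    )
""")
conn.execute("INSERT INTO department VALUES ('CS')")
conn.execute("INSERT INTO course VALUES ('CS01', 'CS')")  # referenced key exists

try:
    # 'ME' is not present in department, so this reference would dangle.
    conn.execute("INSERT INTO course VALUES ('ME01', 'ME')")
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False
print(dangling_allowed)
```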
ER Model to Relational Model
The ER Model, when conceptualized into diagrams, gives a good overview of entity relationships, which is
easier to understand. ER diagrams can be mapped to a relational schema, that is, it is possible to create
a relational schema using an ER diagram. We cannot import all the ER constraints into the relational model,
but an approximate schema can be generated.
There are several processes and algorithms available to convert ER diagrams into relational schema.
Some of them are automated and some of them are manual. Here we focus on mapping the diagram
contents to relational basics.
ER diagrams mainly comprise −

• Entity and its attributes
• Relationship, which is an association among entities.
Mapping Entity
An entity is a real-world object with some attributes.

Mapping Process (Algorithm)

• Create table for each entity.
• Entity's attributes should become fields of tables with their respective data types.
• Declare primary key.
Mapping Relationship
A relationship is an association among entities.

Mapping Process

• Create table for a relationship.
• Add the primary keys of all participating entities as fields of table with their respective data types.
• If relationship has any attribute, add each attribute as field of table.
• Declare a primary key composing all the primary keys of participating entities.
• Declare all foreign key constraints.
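The entity and relationship mapping steps can be sketched as follows, using Python's sqlite3 module. The entities student and course, the relationship enrolls, and its attribute grade are hypothetical examples, not taken from a diagram in this text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- One table per entity: attributes become fields, a primary key is declared.
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id INTEGER PRIMARY KEY, title TEXT);

    -- One table for the relationship between them.
    CREATE TABLE enrolls (
        stu_id    INTEGER NOT NULL REFERENCES student(stu_id),
        course_id INTEGER NOT NULL REFERENCES course(course_id),
        grade     TEXT,                  -- attribute of the relationship
        PRIMARY KEY (stu_id, course_id)  -- composed of the participants' keys
    );
""")
conn.execute("INSERT INTO student VALUES (1, 'Mira')")
conn.execute("INSERT INTO course VALUES (10, 'Database')")
conn.execute("INSERT INTO enrolls VALUES (1, 10, 'A')")

rows = conn.execute("""
    SELECT s.name, c.title, e.grade
    FROM enrolls e
    JOIN student s ON s.stu_id = e.stu_id
    JOIN course  c ON c.course_id = e.course_id
""").fetchall()
print(rows)
```

Joining the relationship table back to the entity tables recovers which student takes which course.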
Mapping Weak Entity Sets
A weak entity set is one which does not have any primary key associated with it.

Mapping Process

• Create table for weak entity set.
• Add all its attributes to table as field.
• Add the primary key of identifying entity set.
• Declare all foreign key constraints.
Mapping Hierarchical Entities
ER specialization or generalization comes in the form of hierarchical entity sets.

Mapping Process
• Create tables for all higher-level entities.
• Create tables for lower-level entities.
• Add primary keys of higher-level entities in the table of lower-level entities.
• In lower-level tables, add all other attributes of lower-level entities.
• Declare primary key of higher-level table and the primary key for lower-level table.
• Declare foreign key constraints.
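As a rough illustration of the hierarchy mapping, the sketch below maps a higher-level person entity and a lower-level student entity with Python's sqlite3 module. The column names, including roll_no, are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Higher-level entity.
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT, gender TEXT);

    -- Lower-level entity: carries the higher-level key plus its own attributes.
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
        roll_no   TEXT
    );
""")
conn.execute("INSERT INTO person VALUES (1, 'Mira', 'F')")
conn.execute("INSERT INTO student VALUES (1, 'R-17')")

# The inherited attributes are reached through the shared key.
row = conn.execute("""
    SELECT p.name, p.gender, s.roll_no
    FROM student s JOIN person p ON p.person_id = s.person_id
""").fetchone()
print(row)
```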

SQL Overview
SQL is a programming language for Relational Databases. It is designed over relational algebra and tuple
relational calculus. SQL comes as a package with all major distributions of RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data definition properties
of SQL, one can design and modify database schema, whereas the data manipulation properties allow SQL
to store and retrieve data from a database.
Data Definition Language
SQL uses the following set of commands to define database schema −
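CREATE
Creates new databases, tables, and views in an RDBMS.
For example −
CREATE DATABASE dbms;
CREATE TABLE article;
CREATE VIEW for_students;
These examples are abbreviated: a complete CREATE TABLE statement also lists the column definitions, and a
complete CREATE VIEW statement includes the defining query.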
DROP
Drops views, tables, and databases from an RDBMS.
Syntax −
DROP object_type object_name;
For example −
DROP DATABASE dbms;
DROP TABLE article;
DROP VIEW for_students;
ALTER
Modifies database schema.
Syntax −
ALTER object_type object_name parameters;
For example −
ALTER TABLE article ADD subject VARCHAR;
This command adds an attribute named subject, of string type, to the relation article.
Data Manipulation Language
SQL is equipped with a data manipulation language (DML). DML modifies the database instance by
inserting, updating and deleting its data. DML is responsible for all forms of data modification in a database.
SQL contains the following set of commands in its DML section −

• SELECT/FROM/WHERE
• INSERT INTO/VALUES
• UPDATE/SET/WHERE
• DELETE FROM/WHERE
These basic constructs allow database programmers and users to enter data and information into the
database and retrieve it efficiently using a number of filter options.
SELECT/FROM/WHERE
• SELECT − This is one of the fundamental query commands of SQL. It is similar to the projection
operation of relational algebra. It selects the listed attributes of the tuples that satisfy the condition
described by the WHERE clause.
• FROM − This clause takes a relation name as an argument from which attributes are to be
selected/projected. In case more than one relation name is given, this clause corresponds to a
Cartesian product.
• WHERE − This clause defines the predicate or conditions that a tuple must match in order for its
attributes to be projected.
For example −
Select author_name
From book_author
Where age > 50;
This command will yield the names of authors from the relation book_author whose age is greater than 50.
INSERT INTO/VALUES
This command is used for inserting values into the rows of a table (relation).
Syntax−
INSERT INTO table (column1 [, column2, column3 ... ]) VALUES (value1 [, value2, value3 ... ])
Or
INSERT INTO table VALUES (value1, [value2, ... ])
For example −
INSERT INTO dbms (Author, Subject) VALUES ('anonymous', 'computers');
UPDATE/SET/WHERE
This command is used for updating or modifying the values of columns in a table (relation).
Syntax −
UPDATE table_name SET column_name = value [, column_name = value ...] [WHERE condition]
For example −
UPDATE dbms SET Author='webmaster' WHERE Author='anonymous';
DELETE FROM/WHERE
This command is used for removing one or more rows from a table (relation).
Syntax −
DELETE FROM table_name [WHERE condition];
For example −
DELETE FROM dbms
WHERE Author='unknown';
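The INSERT, UPDATE, and DELETE statements above can be run end to end with Python's sqlite3 module, as in this minimal sketch. The column types of dbms and the second row (author 'unknown', subject 'networks') are assumptions added so that the DELETE has something to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for the dbms table used in the examples.
conn.execute("CREATE TABLE dbms (Author TEXT, Subject TEXT)")

conn.execute("INSERT INTO dbms (Author, Subject) VALUES ('anonymous', 'computers')")
conn.execute("INSERT INTO dbms (Author, Subject) VALUES ('unknown', 'networks')")

# Rename one author, then remove the rows of another.
conn.execute("UPDATE dbms SET Author = 'webmaster' WHERE Author = 'anonymous'")
conn.execute("DELETE FROM dbms WHERE Author = 'unknown'")

remaining = conn.execute("SELECT Author, Subject FROM dbms").fetchall()
print(remaining)
```

String literals are written with single quotes, which is the standard SQL form; double quotes are reserved for identifiers.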
DBMS - Normalization

Functional Dependency
Functional dependency (FD) is a constraint between two sets of attributes in a relation. Functional
dependency says that if two tuples have the same values for attributes A1, A2,..., An, then those two tuples
must also have the same values for attributes B1, B2, ..., Bn.
Functional dependency is represented by an arrow sign (→) that is, X→Y, where X functionally
determines Y. The left-hand side attributes determine the values of attributes on the right-hand side.
Armstrong's Axioms
If F is a set of functional dependencies, then the closure of F, denoted as F+, is the set of all functional
dependencies logically implied by F. Armstrong's Axioms are a set of rules that, when applied repeatedly,
generate the closure of a set of functional dependencies.
• Reflexive rule − If α is a set of attributes and β is a subset of α, then α → β holds.
• Augmentation rule − If a → b holds and y is a set of attributes, then ay → by also holds. That is, adding
the same attributes to both sides of a dependency does not change the basic dependency.
• Transitivity rule − Same as the transitive rule in algebra: if a → b holds and b → c holds, then a → c
also holds. Here a → b is read as "a functionally determines b".
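In practice the axioms are rarely used to enumerate all of F+. A common substitute is the attribute closure: the set of attributes determined by a given attribute set. The Python sketch below computes it by repeated application of the rules. The function name closure and the (lhs, rhs) pair representation are choices made for this example; the dependencies are the Stu_ID and Zip ones used in this document's normalization examples.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by `attrs`.

    `fds` is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)  # reflexivity: attrs determines itself
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [({"Stu_ID"}, {"Stu_Name", "Zip"}), ({"Zip"}, {"City"})]
stu_closure = closure({"Stu_ID"}, fds)
zip_closure = closure({"Zip"}, fds)
print(sorted(stu_closure))
print(sorted(zip_closure))
```

Stu_ID reaches City through Zip, which is the transitivity rule at work.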
Trivial Functional Dependency
• Trivial − If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a
trivial FD. Trivial FDs always hold.
• Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
• Completely non-trivial − If an FD X → Y holds, where X ∩ Y = Φ, it is said to be a
completely non-trivial FD.
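The three cases reduce to simple set tests, as this short Python sketch shows. The function name classify_fd and the attribute names A, B, C are hypothetical.

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs by comparing its two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                 # Y is a subset of X
    if lhs & rhs:
        return "non-trivial"             # Y is not a subset of X, but they overlap
    return "completely non-trivial"      # X and Y share no attribute


print(classify_fd({"A", "B"}, {"A"}))
print(classify_fd({"A", "B"}, {"B", "C"}))
print(classify_fd({"A"}, {"B"}))
```

Here a completely non-trivial FD is reported under its most specific label, although it is also non-trivial.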
Normalization
If a database design is not perfect, it may contain anomalies, which are like a bad dream for any database
administrator. Managing a database with anomalies is next to impossible.
• Update anomalies − If data items are scattered and are not linked to each other properly, then it
could lead to strange situations. For example, when we try to update one data item having its
copies scattered over several places, a few instances get updated properly while a few others are
left with old values. Such instances leave the database in an inconsistent state.
• Deletion anomalies − We try to delete a record, but parts of it are left undeleted because, without
our being aware of it, the data is also saved somewhere else.
• Insert anomalies − We try to insert data in a record that does not exist at all.
Normalization is a method to remove all these anomalies and bring the database to a consistent state.
First Normal Form
First Normal Form is defined in the definition of relations (tables) itself. This rule defines that all the
attributes in a relation must have atomic domains. The values in an atomic domain are indivisible units.

We re-arrange the relation (table) as below, to convert it to First Normal Form.

Each attribute must contain only a single value from its pre-defined domain.
Second Normal Form
Before we learn about the second normal form, we need to understand the following −
• Prime attribute − An attribute that is part of a candidate key is known as a prime attribute.
• Non-prime attribute − An attribute that is not part of any candidate key is said to be a non-prime
attribute.
If we follow second normal form, then every non-prime attribute should be fully functionally dependent on
the candidate key. That is, if X → A holds, where X is a candidate key and A is a non-prime attribute, then
there should not be any proper subset Y of X for which Y → A also holds.
We see here in Student_Project relation that the prime key attributes are Stu_ID and Proj_ID. According to
the rule, non-key attributes, i.e. Stu_Name and Proj_Name must be dependent upon both and not on any
of the prime key attribute individually. But we find that Stu_Name can be identified by Stu_ID and
Proj_Name can be identified by Proj_ID independently. This is called partial dependency, which is not
allowed in Second Normal Form.

We broke the relation in two as depicted in the above picture. So there exists no partial dependency.
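The partial dependencies described for Student_Project can be detected mechanically. The Python sketch below is one possible check, not a standard algorithm from this text: it computes the attribute closure of every proper subset of the key and reports any non-prime attribute it reaches. The function names are chosen for the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, non_prime, fds):
    """Return (key_subset, attribute) pairs that violate Second Normal Form."""
    found = []
    key = sorted(key)
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(key, size):
            determined = closure(subset, fds)
            for attr in sorted(non_prime):
                if attr in determined:
                    found.append((subset, attr))
    return found


fds = [({"Stu_ID"}, {"Stu_Name"}), ({"Proj_ID"}, {"Proj_Name"})]
violations = partial_dependencies(
    {"Stu_ID", "Proj_ID"}, {"Stu_Name", "Proj_Name"}, fds
)
print(violations)
```

A relation whose key is a single attribute has no proper non-empty key subsets, so the same check returns an empty list for it.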
Third Normal Form
For a relation to be in Third Normal Form, it must be in Second Normal Form and the following must hold −

• No non-prime attribute is transitively dependent on a prime key attribute.
• For any non-trivial functional dependency X → A, either −
   o X is a superkey, or
   o A is a prime attribute.

We find that in the above Student_detail relation, Stu_ID is the key and only prime key attribute. We find
that City can be identified by Stu_ID as well as Zip itself. Neither Zip is a superkey nor is City a prime
attribute. Additionally, Stu_ID → Zip → City, so there exists transitive dependency.
To bring this relation into third normal form, we break the relation into two relations as follows −
Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms. BCNF states that −

• For any non-trivial functional dependency, X → A, X must be a super-key.


In the above image, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key in the
relation ZipCodes. So,
Stu_ID → Stu_Name, Zip
and
Zip → City
Which confirms that both the relations are in BCNF.
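The BCNF condition can be checked with a short Python sketch: for each given non-trivial dependency, test whether the closure of its left side covers every attribute of the relation. The function names are chosen for this example, and the check assumes the listed dependencies are exactly those that hold on the relation being tested.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if the left side of every non-trivial FD is a super-key."""
    attributes = set(attributes)
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FDs never violate BCNF
        if not closure(lhs, fds) >= attributes:
            return False  # lhs does not determine the whole relation
    return True


student_detail_ok = is_bcnf(
    {"Stu_ID", "Stu_Name", "Zip"}, [({"Stu_ID"}, {"Stu_Name", "Zip"})]
)
zipcodes_ok = is_bcnf({"Zip", "City"}, [({"Zip"}, {"City"})])
# The undecomposed relation, for comparison: Zip -> City breaks the rule.
original_ok = is_bcnf(
    {"Stu_ID", "Stu_Name", "Zip", "City"},
    [({"Stu_ID"}, {"Stu_Name", "Zip"}), ({"Zip"}, {"City"})],
)
print(student_detail_ok, zipcodes_ok, original_ok)
```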
DBMS - Joins

We understand the benefits of taking a Cartesian product of two relations, which gives us all the possible
tuples that are paired together. But it might not be feasible for us in certain cases to take a Cartesian
product where we encounter huge relations with thousands of tuples having a considerably large number
of attributes.
Join is a combination of a Cartesian product followed by a selection process. A Join operation pairs two
tuples from different relations, if and only if a given join condition is satisfied.
We will briefly describe various join types in the following sections.
Theta (θ) Join
Theta join combines tuples from different relations provided they satisfy the theta condition. The join
condition is denoted by the symbol θ.
Notation
R1 ⋈θ R2
R1 and R2 are relations having attributes (A1, A2, .., An) and (B1, B2,.. ,Bn) such that the attributes don’t
have anything in common, that is R1 ∩ R2 = Φ.
Theta join can use all kinds of comparison operators.

Student

SID   Name    Std
101   Alex    10
102   Maria   11

Subjects

Class   Subject
10      Math
10      English
11      Music
11      Sports

Student_Detail −
STUDENT ⋈ Student.Std = Subject.Class SUBJECT

Student_detail

SID   Name    Std   Class   Subject
101   Alex    10    10      Math
101   Alex    10    10      English
102   Maria   11    11      Music
102   Maria   11    11      Sports

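The description of a join as a Cartesian product followed by a selection translates directly into code. The Python sketch below represents each tuple as a dictionary; the function name theta_join is chosen for the example, and the data is the Student and Subjects tables above.

```python
def theta_join(r1, r2, condition):
    """Pair every tuple of r1 with every tuple of r2, keep pairs passing `condition`."""
    return [{**t1, **t2} for t1 in r1 for t2 in r2 if condition(t1, t2)]


student = [
    {"SID": 101, "Name": "Alex", "Std": 10},
    {"SID": 102, "Name": "Maria", "Std": 11},
]
subjects = [
    {"Class": 10, "Subject": "Math"},
    {"Class": 10, "Subject": "English"},
    {"Class": 11, "Subject": "Music"},
    {"Class": 11, "Subject": "Sports"},
]

# Theta condition: Student.Std = Subject.Class
student_detail = theta_join(student, subjects, lambda s, c: s["Std"] == c["Class"])
for row in student_detail:
    print(row)
```

Because the condition is passed in as a function, any comparison operator can be used, for example s["Std"] < c["Class"].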

Equijoin
When Theta join uses only equality comparison operator, it is said to be equijoin. The above example
corresponds to equijoin.
Natural Join (⋈)
Natural join does not use any comparison operator. It does not concatenate the way a Cartesian product
does. We can perform a Natural Join only if there is at least one common attribute that exists between two
relations. In addition, the attributes must have the same name and domain.
Natural join acts on those matching attributes where the values of the attributes in both relations are
the same.

Courses

CID    Course        Dept
CS01   Database      CS
ME01   Mechanics     ME
EE01   Electronics   EE

HoD

Dept   Head
CS     Alex
ME     Maya
EE     Mira

Courses ⋈ HoD

Dept   CID    Course        Head
CS     CS01   Database      Alex
ME     ME01   Mechanics     Maya
EE     EE01   Electronics   Mira

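A natural join can be sketched in Python by finding the attribute names shared by both relations and keeping the pairs that agree on them. The function name natural_join is an assumption of this example, and following the text it refuses to run when no common attribute exists.

```python
def natural_join(r1, r2):
    """Join tuples (dicts) that agree on all attributes common to both relations."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    if not common:
        raise ValueError("natural join needs at least one common attribute")
    return [
        {**t1, **t2}  # common attributes appear once in the result
        for t1 in r1
        for t2 in r2
        if all(t1[a] == t2[a] for a in common)
    ]


courses = [
    {"CID": "CS01", "Course": "Database", "Dept": "CS"},
    {"CID": "ME01", "Course": "Mechanics", "Dept": "ME"},
    {"CID": "EE01", "Course": "Electronics", "Dept": "EE"},
]
hod = [
    {"Dept": "CS", "Head": "Alex"},
    {"Dept": "ME", "Head": "Maya"},
    {"Dept": "EE", "Head": "Mira"},
]

joined = natural_join(courses, hod)
for row in joined:
    print(row["Dept"], row["CID"], row["Course"], row["Head"])
```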

Outer Joins
Theta Join, Equijoin, and Natural Join are called inner joins. An inner join includes only those tuples with
matching attributes and the rest are discarded in the resulting relation. Therefore, we need to use outer
joins to include all the tuples from the participating relations in the resulting relation. There are three kinds
of outer joins − left outer join, right outer join, and full outer join.

Left Outer Join (R ⟕ S)

All the tuples from the Left relation, R, are included in the resulting relation. If there are tuples in R without
any matching tuple in the Right relation S, then the S-attributes of the resulting relation are made NULL.

Left

A     B
100   Database
101   Mechanics
102   Electronics

Right

A     B
100   Alex
102   Maya
104   Mira

Courses ⟕ HoD

A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya


Right Outer Join (R ⟖ S)

All the tuples from the Right relation, S, are included in the resulting relation. If there are tuples in S
without any matching tuple in R, then the R-attributes of the resulting relation are made NULL.

Courses ⟖ HoD

A     B             C     D
100   Database      100   Alex
102   Electronics   102   Maya
---   ---           104   Mira


Full Outer Join (R ⟗ S)

All the tuples from both participating relations are included in the resulting relation. If there are no
matching tuples for both relations, their respective unmatched attributes are made NULL.

Courses ⟗ HoD

A     B             C     D
100   Database      100   Alex
101   Mechanics     ---   ---
102   Electronics   102   Maya
---   ---           104   Mira

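All three outer joins can be sketched with one Python function that pads unmatched tuples with None, which stands in for NULL. The function name outer_join and its keep_left and keep_right flags are assumptions of this example. The right relation's attributes are named C and D here, as in the result tables, so that they do not collide with A and B, and both relations are assumed to be non-empty.

```python
def outer_join(left, right, condition, keep_left=True, keep_right=True):
    """Join two lists of dicts, padding unmatched tuples with None (NULL)."""
    left_attrs, right_attrs = list(left[0]), list(right[0])
    result, matched_right = [], set()
    for t1 in left:
        found = False
        for j, t2 in enumerate(right):
            if condition(t1, t2):
                result.append({**t1, **t2})
                matched_right.add(j)
                found = True
        if not found and keep_left:
            result.append({**t1, **dict.fromkeys(right_attrs)})
    if keep_right:
        for j, t2 in enumerate(right):
            if j not in matched_right:
                result.append({**dict.fromkeys(left_attrs), **t2})
    return result


left = [
    {"A": 100, "B": "Database"},
    {"A": 101, "B": "Mechanics"},
    {"A": 102, "B": "Electronics"},
]
right = [{"C": 100, "D": "Alex"}, {"C": 102, "D": "Maya"}, {"C": 104, "D": "Mira"}]


def same_id(t1, t2):
    return t1["A"] == t2["C"]


left_outer = [tuple(r.values()) for r in outer_join(left, right, same_id, keep_right=False)]
right_outer = [tuple(r.values()) for r in outer_join(left, right, same_id, keep_left=False)]
full_outer = [tuple(r.values()) for r in outer_join(left, right, same_id)]
print(left_outer)
print(right_outer)
print(full_outer)
```

Setting both flags to False gives the plain inner join, which shows how the outer joins extend it.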
