Dbms Unit II
UNIT-II
Introduction of Relational Model
❖ In 1970, E.F. Codd proposed the relational model, a non-procedural approach
to modeling data in the form of relations, or tables.
❖ In the relational model, tables are interpreted as relations. If we model
the database using ER diagrams, we must convert them into the relational
model, which can then be implemented in an RDBMS such as MySQL using a
relational language such as SQL.
❖ In the relational model, each table consists of rows and columns. A table may
have any number of rows, but its number of columns is fixed. The table rows
are called tuples, and each tuple holds the complete information about a
specific entity. A tuple is also called a record, so the relational model is
also known as a record-based model. Table columns are called attributes
because they describe the properties of the entities in the table. Each
attribute must have a type so that values can be stored in it.
❖ The relational model is today the primary data model for commercial data
processing applications. It attained its primary position because of its simplicity,
which eases the job of the programmer, compared to earlier data models such
as the network model or the hierarchical model. In this unit, we first study the
fundamentals of the relational model. A substantial theory exists for relational
databases.
Structure of Relational Databases:
❖ A relational database consists of a collection of tables, each of which is assigned
a unique name. For example, consider the instructor table in the following
diagram, which stores information about instructors. The table has four column
headers: ID, name, dept name, and salary. Each row of this table records
information about an instructor, consisting of the instructor’s ID, name, dept
name, and salary.
❖ Similarly, the course table of diagram2 stores information about courses,
consisting of a course id, title, dept name, and credits, for each course. Note
that each instructor is identified by the value of the column ID, while each
course is identified by the value of the column course id.
❖ Diagram3 shows a third table, prereq, which stores the prerequisite courses for
each course. The table has two columns, course id and prereq id. Each row
consists of a pair of course identifiers such that the second course is a
prerequisite for the first course.
❖ Thus, a row in the prereq table indicates that two courses are related in the
sense that one course is a prerequisite for the other. As another example,
consider the table instructor: a row in the table can be thought of as
representing the relationship between a specified ID and the corresponding
values for name, dept name, and salary.
❖ In general, a row in a table represents a relationship among a set of values.
Since a table is a collection of such relationships, there is a close
correspondence between the concept of table and the mathematical concept
of relation, from which the relational data model takes its name.
❖ In mathematical terminology, a tuple is simply a sequence (or list) of values. A
relationship between n values is represented mathematically by an n-tuple of
values, i.e., a tuple with n values, which corresponds to a row in a table.
❖ The domain of an attribute is the set of values it is permitted to take, and a
domain is atomic if its elements are treated as indivisible units. For example,
suppose the table instructor had an attribute phone number, which can store a
set of phone numbers corresponding to the instructor. Then the domain of phone
number would not be atomic, since an element of the domain is a set of phone
numbers, and it has subparts, namely the individual phone numbers in the set.
❖ The important issue is not what the domain itself is, but rather how we use domain
elements in our database. Suppose now that the phone number attribute stores a
single phone number. Even then, if we split the value from the phone number
attribute into a country code, an area code, and a local number, we would be
treating it as a non-atomic value. If we treat each phone number as a single
indivisible unit, then the attribute phone number would have an atomic domain.
❖ The null value is a special value that signifies that the value is unknown or does not
exist. For example, suppose as before that we include the attribute phone number
in the instructor relation. It may be that an instructor does not have a phone
number at all, or that the telephone number is unlisted. We would then have to use
the null value to signify that the value is unknown or does not exist. We shall see
later that null values cause a number of difficulties when we access or update the
database, and thus should be eliminated if at all possible. We shall assume null
values are absent initially.
Key Differences between ER Model and Relational Model
❖ The following points explain the main differences between ER Model and
Relational Model:
❖ The main distinction between the ER model and the Relational Model is that the ER
model describes entities, their attributes, and the relationships between them. The
Relational Model, on the other hand, describes how that design is implemented as tables.
❖ The Relational Model is the implementation or representational model, while the ER
Model is the high-level or conceptual model.
❖ An ER model represents data using components such as entity sets, relationship
sets, and attributes. The Relational Model, on the other hand, represents data
using components such as tuples, attributes, and attribute domains.
❖ As compared to a Relational Model, an ER model makes it easier to understand the
relationships between entities.
❖ Mapping cardinality is expressed directly as a constraint in the ER model, while the
Relational Model has no direct notation for it; there, cardinality can only be
enforced indirectly, through key and foreign key constraints.
Conclusion:
❖ We have made a comparison between ER Model and Relational Model. We
may conclude that if we convert an E-R model to a Relational model, we must
ensure that each strong entity has its own table.
CONSTRAINTS OVER RELATIONS
❖ A database is only as good as the information stored in it, and a DBMS must
therefore help prevent the entry of incorrect information. An integrity constraint
(IC) is a condition that is specified on a database schema, and restricts the data
that can be stored in an instance of the database. If a database instance
satisfies all the integrity constraints specified on the database schema, it is a
legal instance. A DBMS enforces integrity constraints, in that it permits only
legal instances to be stored in the database.
Integrity constraints are specified and enforced at different times:
❖ When the DBA or end user defines a database schema, he or she specifies the
ICs that must hold on any instance of this database.
❖ When a database application is run, the DBMS checks for violations and
disallows changes to the data that violate the specified ICs. (In some situations,
rather than disallow the change, the DBMS might instead make some
compensating changes to the data to ensure that the database instance
satisfies all ICs. In any case, changes to the database are not allowed to create
an instance that violates any IC.)
Keys
❖ We must have a way to specify how tuples within a given relation are
distinguished. This is expressed in terms of their attributes. That is, the
attribute values of a tuple must be such that they can uniquely identify
the tuple. In other words, no two tuples in a relation are allowed to have
exactly the same value for all attributes.
❖ A super key is a set of one or more attributes that, taken collectively, allow us to
identify uniquely a tuple in the relation. For example, the ID attribute of the
relation instructor is sufficient to distinguish one instructor tuple from another.
Thus, ID is a super key.
❖ The name attribute of instructor, on the other hand, is not a super key, because
several instructors might have the same name. Formally, let R denote the set of
attributes in the schema of relation r. If we say that a subset K of R is a super
key for r, we are restricting consideration to instances of relations r in which no
two distinct tuples have the same values on all attributes in K. That is, if t1 and
t2 are in r and t1 ≠ t2, then t1.K ≠ t2.K.
❖ A super key may contain extraneous attributes. For example, the combination of ID
and name is a super key for the relation instructor. If K is a super key, then so is any
superset of K. We are often interested in super keys for which no proper subset is a
super key. Such minimal super keys are called candidate keys.
❖ It is possible that several distinct sets of attributes could serve as a candidate key.
Suppose that a combination of name and dept name is sufficient to distinguish
among members of the instructor relation. Then, both {ID} and {name, dept name}
are candidate keys. Although the attributes ID and name together can distinguish
instructor tuples, their combination, {ID, name}, does not form a candidate key,
since the attribute ID alone is a candidate key.
❖ We shall use the term primary key to denote a candidate key that is chosen by the
database designer as the principal means of identifying tuples within a relation. A
key (whether primary, candidate, or super) is a property of the entire relation,
rather than of the individual tuples. Any two individual tuples in the relation are
prohibited from having the same value on the key attributes at the same time. The
designation of a key represents a constraint in the real-world enterprise being
modeled.
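The definitions of super key and candidate key above can be checked mechanically against a single relation instance. The following is a minimal sketch, assuming a small made-up instructor instance; the helper names `is_superkey` and `is_candidate_key` are hypothetical. Note that a key is a constraint on every legal instance, so passing this check on one instance does not prove that a set of attributes is a key.

```python
from itertools import combinations

# Hypothetical instance of the instructor relation (values are illustrative).
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance"},
    {"ID": "15151", "name": "Wu", "dept_name": "Music"},
    {"ID": "45565", "name": "Katz", "dept_name": "Comp. Sci."},
]

def is_superkey(rows, attrs):
    """True if no two distinct tuples in rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """True if attrs is a super key and no proper subset of attrs is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_superkey(instructor, ["ID"]))                      # True
print(is_superkey(instructor, ["name"]))                    # False: "Wu" appears twice
print(is_candidate_key(instructor, ["ID", "name"]))         # False: {ID} alone suffices
print(is_candidate_key(instructor, ["name", "dept_name"]))  # True for this instance
```

The minimality test simply tries every proper subset, which is fine for the handful of attributes in a classroom example.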
❖ Primary keys must be chosen with care. As we noted, the name of a person
is obviously not sufficient, because there may be many people with the
same name. In the United States, the social-security number attribute of a
person would be a candidate key. Since non-U.S. residents usually do not
have social-security numbers, international enterprises must generate their
own unique identifiers.
❖ An alternative is to use some unique combination of other attributes as a
key. The primary key should be chosen such that its attribute values are
never, or very rarely, changed. For instance, the address field of a person
should not be part of the primary key, since it is likely to change. Social-
security numbers, on the other hand, are guaranteed never to change.
Unique identifiers generated by enterprises generally do not change, except
if two enterprises merge; in such a case the same identifier may have been
issued by both enterprises, and a reallocation of identifiers may be required
to make sure they are unique.
❖ It is customary to list the primary key attributes of a relation schema
before the other attributes; for example, the dept name attribute of
department is listed first, since it is the primary key. Primary key
attributes are also underlined. A relation, say r1, may include among its
attributes the primary key of another relation, say r2. This attribute is
called a foreign key from r1, referencing r2.
❖ The relation r1 is also called the referencing relation of the foreign key
dependency, and r2 is called the referenced relation of the foreign key.
For example, the attribute dept name in instructor is a foreign key from
instructor, referencing department, since dept name is the primary key
of department. In any database instance, given any tuple, say ta, from
the instructor relation, there must be some tuple, say tb, in the
department relation such that the value of the dept name attribute of ta
is the same as the value of the primary key, dept name, of tb.
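The foreign key from instructor to department can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the column types, the `building` column, and the sample rows are illustrative, and `try_insert` is a hypothetical helper. SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)")
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT PRIMARY KEY,
        name      TEXT,
        dept_name TEXT REFERENCES department(dept_name),
        salary    REAL
    )""")
conn.execute("INSERT INTO department VALUES ('Physics', 'Watson')")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics', 95000)")

def try_insert(row):
    """Return True if the instructor row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO instructor VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# There is no 'Chemistry' tuple in department, so this tuple has nothing to reference.
ok = try_insert(("33333", "Curie", "Chemistry", 90000))
print(ok)  # False
```

The rejected row is exactly the case the text rules out: a tuple ta in instructor with no matching tuple tb in department.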
➢ Now consider the section and teaches relations. It would be reasonable
to require that if a section exists for a course, it must be taught by at
least one instructor; however, it could possibly be taught by more than
one instructor. To enforce this constraint, we would require that if a
particular (course id, sec id, semester, year) combination appears in
section, then the same combination must appear in teaches. However,
this set of values does not form a primary key for teaches, since more
than one instructor may teach one such section. As a result, we cannot
declare a foreign key constraint from section to teaches (although we
can define a foreign key constraint in the other direction, from teaches
to section).
➢ The constraint from section to teaches is an example of a referential
integrity constraint; a referential integrity constraint requires that the
values appearing in specified attributes of any tuple in the referencing
relation also appear in specified attributes of at least one tuple in the
referenced relation.
Schema Diagrams
❖ A database schema, along with primary key and foreign key
dependencies, can be depicted by schema diagrams. Figure 1.12
shows the schema diagram for our university organization. Each
relation appears as a box, with the relation name at the top in
blue, and the attributes listed inside the box. Primary key
attributes are shown underlined. Foreign key dependencies
appear as arrows from the foreign key attributes of the
referencing relation to the primary key of the referenced relation.
❖ Referential integrity constraints other than foreign key constraints
are not shown explicitly in schema diagrams. We will study a
different diagrammatic representation called the entity-relationship
diagram.
Schema Diagram for Hospitality Database:
Integrity Constraints
❖ Integrity constraints are a set of rules used to maintain the quality of
information.
❖ Integrity constraints ensure that data insertion, updating, and other
processes are performed in such a way that data integrity is not
affected.
❖ Thus, integrity constraints are used to guard against accidental damage to the
database.
Types of Integrity Constraint
[Figure: Integrity Constraint (classification diagram)]
[Figure: E-R diagram of the entity set STUDENT with attributes sid, sname, course, Branch]
Relational Model
Student Table: SID SNAME COURSE BRANCH
LOGICAL DATABASE DESIGN: ER TO RELATIONAL
II)For strong entity set with composite Attributes
Example:
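E-R Diagram: [Figure: entity set STUDENT; labels F_name, L_name, sname, course, sid, Branch]
[Figure: entity set STUDENT; labels sname, City, Rollno, Mobile no]
[Figure: entity sets LOAN and PAYMENT; labels Loan_, Pay, Pay_amt]
[Figure: entity set STUDENT; labels sname, City, Rollno, Mobile no, Type, FEE]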
Relational table:
Table1: create table student(sid number(4) primary key, sname char(10));
Table2: create table course(cid number(4) primary key, cname char(10), fee number(5));
Table3: create table enroll(sid number(4),cid number(4),type char(14), primary key(sid,cid));
IV)Converting E R Diagram into Relational Model
➢ The above E R diagram could also be converted by merging enroll with Student,
or enroll with course, instead of keeping three tables. However, this causes more
data duplication and leads to many DML (insert, update, delete) anomalies, for example:
Scenario1:
Scenario-2:
IV)Converting E R Diagram into Relational Model (One to Many Relationship)
Example:
E-R Diagram: [Figure: attribute labels SID, SNAME, ID, FNAME]
❑ All the available information about the Works_In2 table is captured by the
following SQL definition:
❑ CREATE TABLE Works_In2 ( ssn CHAR(11), did INTEGER, address CHAR(20), since DATE,
PRIMARY KEY (ssn, did, address),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (address) REFERENCES Locations,
FOREIGN KEY (did) REFERENCES Departments )
Translating Relationship Sets with Key Constraints
❖ If a relationship set involves n entity sets and some m of them are linked via
arrows in the ER diagram, the key for any one of these m entity sets constitutes
a key for the relation to which the relationship set is mapped. Thus we have m
candidate keys, and one of these should be designated as the primary key.
Consider the relationship set Manages shown in diagram.
❖ However, because each department has at most one manager, no two tuples
can have the same did value but differ on the ssn value. A consequence of this
observation is that did is itself a key for Manages. The Manages relation can be
defined using the following SQL statement:
❖ CREATE TABLE Manages ( ssn CHAR(11), did INTEGER, since DATE,
PRIMARY KEY (did),
FOREIGN KEY (ssn) REFERENCES Employees,
FOREIGN KEY (did) REFERENCES Departments )
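The effect of declaring did as the primary key of Manages can be observed with Python's built-in `sqlite3` module. This is a minimal sketch: the Employees and Departments columns other than their keys, the sample rows, and the helper `second_manager_allowed` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key checks in SQLite
conn.executescript("""
    CREATE TABLE Employees   (ssn CHAR(11) PRIMARY KEY, name CHAR(30));
    CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname CHAR(20));
    CREATE TABLE Manages (
        ssn   CHAR(11),
        did   INTEGER,
        since DATE,
        PRIMARY KEY (did),
        FOREIGN KEY (ssn) REFERENCES Employees,
        FOREIGN KEY (did) REFERENCES Departments
    );
    INSERT INTO Employees VALUES ('111-11-1111', 'Asha'), ('222-22-2222', 'Ravi');
    INSERT INTO Departments VALUES (51, 'Toys');
    INSERT INTO Manages VALUES ('111-11-1111', 51, '2020-01-01');
""")

def second_manager_allowed():
    """Try to record a second manager for department 51."""
    try:
        conn.execute("INSERT INTO Manages VALUES ('222-22-2222', 51, '2021-06-01')")
        return True
    except sqlite3.IntegrityError:
        return False

print(second_manager_allowed())  # False: did is the key, so one manager per department
```

Had the primary key been (ssn, did), the second insert would have been accepted, which is why the key constraint from the ER diagram changes the choice of key.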
❖ A Dependents entity can be identified uniquely only if we take the key of the
owning Employees entity and the pname of the Dependents entity, and the
Dependents entity must be deleted if the owning Employees entity is deleted.
We can capture the desired semantics with the following definition of the
Dep_Policy relation:
❖ CREATE TABLE Dep_Policy ( pname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11),
PRIMARY KEY (pname, ssn),
FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE )
❖ Observe that the primary key is (pname, ssn), since Dependents is a weak entity.
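The cascading delete for the weak entity can be seen in action with Python's built-in `sqlite3` module. This is a sketch only: the table is written as Dep_Policy, and the `name` column of Employees and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE in SQLite
conn.executescript("""
    CREATE TABLE Employees (ssn CHAR(11) PRIMARY KEY, name CHAR(30));
    CREATE TABLE Dep_Policy (
        pname CHAR(20),
        age   INTEGER,
        cost  REAL,
        ssn   CHAR(11),
        PRIMARY KEY (pname, ssn),
        FOREIGN KEY (ssn) REFERENCES Employees ON DELETE CASCADE
    );
    INSERT INTO Employees VALUES ('111-11-1111', 'Asha'), ('222-22-2222', 'Ravi');
    INSERT INTO Dep_Policy VALUES ('Joe', 8, 120.0, '111-11-1111'),
                                  ('Joe', 5, 100.0, '222-22-2222');
""")

# Deleting the owning employee also deletes that employee's dependents.
conn.execute("DELETE FROM Employees WHERE ssn = '111-11-1111'")
remaining = conn.execute("SELECT pname, ssn FROM Dep_Policy").fetchall()
print(remaining)  # [('Joe', '222-22-2222')]
```

Two dependents named 'Joe' can coexist because pname alone is only a partial key; the pair (pname, ssn) is what identifies a dependent.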
Translating Class Hierarchies
❖ We present the two basic approaches to handling ISA hierarchies by applying
them to the ER diagram shown in following diagram:
❖ We can map each of the entity sets Employees, Hourly Emps, and Contract
Emps to a distinct relation. The Employees relation is created as in Section 2.2.
We discuss Hourly Emps here; Contract Emps is handled similarly. The relation
for Hourly Emps includes the hourly wages and hours worked attributes of
Hourly Emps, together with the key attribute of the superclass (ssn), which
serves as its primary key and as a foreign key referencing Employees.
❖ Alternatively, we can create just two relations, corresponding to Hourly Emps
and Contract Emps. The relation for Hourly Emps includes all the attributes of
Hourly Emps as well as all the attributes of Employees (i.e., ssn, name,
lot, hourly wages, hours worked).
❖ The first approach is general and is always applicable. Queries in which we
want to examine all employees and do not care about the attributes specific to
the subclasses are handled easily using the Employees relation.
❖ The second approach is not applicable if we have employees who are neither
hourly employees nor contract employees, since there is no way to store such
employees. Also, if an employee is both an Hourly Emps and a Contract Emps
entity, then the name and lot values are stored twice.
Translating ER Diagrams with Aggregation
❖ Translating aggregation into the relational model is easy because there is no
real distinction between entities and relationships in the relational model.
• Consider the ER diagram shown in the above figure. The Employees, Projects, and
Departments entity sets and the Sponsors relationship set are mapped as
described in previous sections. For the Monitors relationship set, we create a
relation with the following attributes: the key attributes of Employees (ssn), the
key attributes of Sponsors (did, pid), and the descriptive attributes of Monitors
(until). This translation is essentially the standard mapping for a relationship
set, as described in Section 3.5.2.
• There is a special case in which this translation can be refined further by
dropping the Sponsors relation. Consider the Sponsors relation. It has attributes
pid, did, and since, and in general we need it (in addition to Monitors) for two
reasons:
• 1. We have to record the descriptive attributes (in our example, since) of the
Sponsors relationship.
• 2. Not every sponsorship has a monitor, and thus some pid, did pairs in the
Sponsors relation may not appear in the Monitors relation.
• However, if Sponsors has no descriptive attributes and has total participation
in Monitors, every possible instance of the Sponsors relation can be obtained by
looking at the pid, did columns of the Monitors relation. Thus, we need not store
the Sponsors relation in this case.
INTRODUCTION TO VIEWS
❖ A view is a table whose rows are not explicitly stored in the database but are
computed as needed from a view definition. Consider the Students and
Enrolled relations. Suppose that we are often interested in finding the names
and student identifiers of students who got a grade of B in some course,
together with the cid for the course.
❖ We can define a view for this purpose. Using SQL-92 notation:
CREATE VIEW B-Students (name, sid, course)
AS SELECT S.sname, S.sid, E.cid
FROM Students S, Enrolled E
WHERE S.sid = E.sid AND E.grade = 'B';
❖ The view B-Students has three fields called name, sid, and course with the
same domains as the fields sname and sid in Students and cid in Enrolled. This
view can be used just like a base table, or explicitly stored table, in defining new
queries or views.
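That a view is computed on demand rather than stored can be checked with Python's built-in `sqlite3` module. In this sketch the view is named B_Students, because a hyphen is not allowed in a plain SQLite identifier; the column types and sample rows are likewise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (sid TEXT PRIMARY KEY, sname TEXT);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
    INSERT INTO Students VALUES ('53666', 'Jones'), ('53688', 'Smith');
    INSERT INTO Enrolled VALUES ('53666', 'History105', 'B'),
                                ('53688', 'Reggae203', 'A');
    CREATE VIEW B_Students (name, sid, course) AS
        SELECT S.sname, S.sid, E.cid
        FROM Students S, Enrolled E
        WHERE S.sid = E.sid AND E.grade = 'B';
""")

# The view is queried exactly like a base table.
rows = conn.execute("SELECT name, sid, course FROM B_Students").fetchall()
print(rows)  # [('Jones', '53666', 'History105')]

# A later change to a base table shows up in the view without redefining it.
conn.execute("INSERT INTO Enrolled VALUES ('53688', 'Topology112', 'B')")
names_after = conn.execute("SELECT name FROM B_Students ORDER BY name").fetchall()
print(names_after)  # [('Jones',), ('Smith',)]
```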
Updates on Views
❖ The motivation behind the view mechanism is to tailor how users see the data.
Users should not have to worry about the view versus base table distinction.
This goal is indeed achieved in the case of queries on views; a view can be used
just like any other relation in defining a query.
DESTROYING/ALTERING TABLES AND VIEWS
❖ If we decide that we no longer need a base table and want to destroy it (i.e.,
delete all the rows and remove the table definition information), we can use
the DROP TABLE command. For example, DROP TABLE Students RESTRICT
destroys the Students table unless some view or integrity constraint refers to
Students; if so, the command fails.
❖ If the keyword RESTRICT is replaced by CASCADE, Students is dropped and any
referencing views or integrity constraints are (recursively) dropped as well; one
of these two keywords must always be specified. A view can be dropped using
the DROP VIEW command, which is just like DROP TABLE.
❖ ALTER TABLE modifies the structure of an existing table. ALTER TABLE can also
be used to delete columns and to add or drop integrity constraints on a table;
we will not discuss these aspects of the command beyond remarking that
dropping columns is treated very similarly to dropping tables or views.
❖ Query languages are specialized languages for asking questions, or queries,
that involve the data in a database.
Relational Algebra
❖ Relational algebra is one of the two formal query languages associated with the
relational model. Queries in algebra are composed using a collection of
operators.
❖ A fundamental property is that every operator in the algebra accepts (one or
two) relation instances as arguments and returns a relation instance as the
result. This property makes it easy to compose operators to form a complex
query.
❖ A relational algebra expression is recursively defined to be a relation, a unary
algebra operator applied to a single expression, or a binary algebra operator
applied to two expressions.
❖ We describe the basic operators of the algebra (selection, projection, union,
cross-product, and difference), as well as some additional operators that can be
defined in terms of the basic operators. Each relational query describes a step-
by-step procedure for computing the desired answer, based on the order in
which operators are applied in the query. The procedural nature of the algebra
allows us to think of an algebra expression as a recipe, or a plan, for evaluating
a query, and relational systems in fact use algebra expressions to represent
query evaluation plans.
Types of Relational operation
Examples of the select operation (σ), which returns the tuples of a relation
that satisfy a given predicate −
σ subject = "database" (Books)
Output − Selects tuples from Books where subject is 'database'.
σ subject = "database" and price = 450 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
σ (subject = "database" and price = 450) or year > 2010 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450, or those
books published after 2010.
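The selection examples above can be mimicked in a few lines of Python. This is a minimal sketch: the Books tuples are made up, and `select` is a hypothetical helper that models a relation as a list of dictionaries.

```python
# Hypothetical Books relation; each tuple is a dict from attribute name to value.
books = [
    {"title": "DB Concepts", "subject": "database", "price": 450, "year": 2008},
    {"title": "SQL Basics", "subject": "database", "price": 300, "year": 2005},
    {"title": "Networks", "subject": "networking", "price": 500, "year": 2015},
]

def select(relation, predicate):
    """sigma_predicate(relation): keep the tuples for which predicate is true."""
    return [t for t in relation if predicate(t)]

by_subject = select(books, lambda t: t["subject"] == "database")

# 'and' binds tighter than 'or', which matches the reading given in the text.
mixed = select(
    books,
    lambda t: t["subject"] == "database" and t["price"] == 450 or t["year"] > 2010,
)

print([t["title"] for t in by_subject])  # ['DB Concepts', 'SQL Basics']
print([t["title"] for t in mixed])       # ['DB Concepts', 'Networks']
```

Selection never changes the set of attributes; it only filters rows.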
Project Operation:
❖ This operation lists the attributes that we wish to appear in the
result. The rest of the attributes are eliminated from the table.
It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER RELATION
Input:
∏ NAME, CITY (CUSTOMER)
Another example −
∏subject, author (Books)
Projects the columns named subject and author from the
relation Books.
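Projection can be sketched in the same style. The CUSTOMER tuples below are made up, and `project` is a hypothetical helper. Because a relation is a set, duplicate rows that arise after dropping columns are removed.

```python
def project(relation, attrs):
    """pi_attrs(relation): keep only attrs and drop duplicate result tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

# Hypothetical CUSTOMER relation.
customer = [
    {"NAME": "Harry", "STREET": "30 Broad St", "CITY": "Harrison"},
    {"NAME": "Jones", "STREET": "Main St", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North St", "CITY": "Rye"},
]

print(project(customer, ["NAME", "CITY"]))  # three rows, STREET removed
print(project(customer, ["CITY"]))          # [{'CITY': 'Harrison'}, {'CITY': 'Rye'}]
```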
Union Operation:
❖ Suppose there are two relations R and S. The union operation contains all the
tuples that are in R, in S, or in both.
❖ It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
R and S must have the same number of attributes, and corresponding attributes
must have compatible domains.
Duplicate tuples are eliminated automatically.
Example: DEPOSITOR RELATION
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
Set Intersection:
❖ Suppose there are two Relations R and S. The set intersection operation
contains all tuples that are in both R & S.
❖ It is denoted by intersection ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
Set Difference:
❖ Suppose there are two relations R and S. The set difference operation
contains all tuples that are in R but not in S.
❖ It is denoted by minus (-).
Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
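Union, intersection, and difference map directly onto Python's set operators. The sketch below assumes made-up results for ∏ CUSTOMER_NAME (BORROW) and ∏ CUSTOMER_NAME (DEPOSITOR); the customer names are illustrative only.

```python
# Hypothetical single-attribute relations, each tuple written as a 1-tuple.
borrow_names = {("Jones",), ("Smith",), ("Hayes",)}
depositor_names = {("Smith",), ("Hayes",), ("Turner",)}

union = borrow_names | depositor_names         # in BORROW, in DEPOSITOR, or in both
intersection = borrow_names & depositor_names  # in both relations
difference = borrow_names - depositor_names    # in BORROW but not in DEPOSITOR

print(sorted(union))         # [('Hayes',), ('Jones',), ('Smith',), ('Turner',)]
print(sorted(intersection))  # [('Hayes',), ('Smith',)]
print(sorted(difference))    # [('Jones',)]
```

Both operands have the same single attribute, so they are union compatible; duplicates vanish automatically because the operands are sets.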
Cartesian product
❖ The Cartesian product is used to combine each row in one table with each row
in the other table. It is also known as a cross product.
❖ It is denoted by X.
Notation: E X D
Example:
Input:
EMPLOYEE X DEPARTMENT
Output:
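The Cartesian product pairs every tuple of one relation with every tuple of the other, so a relation with m tuples and one with n tuples give m × n result tuples. This is a minimal sketch with made-up EMPLOYEE and DEPARTMENT tuples and a hypothetical helper `cross`; attribute names are assumed not to clash.

```python
from itertools import product

# Hypothetical EMPLOYEE and DEPARTMENT relations.
employee = [
    {"EMP_ID": 1, "EMP_NAME": "Smith", "EMP_DEPT": "A"},
    {"EMP_ID": 2, "EMP_NAME": "Harry", "EMP_DEPT": "C"},
]
department = [
    {"DEPT_NO": "A", "DEPT_NAME": "Marketing"},
    {"DEPT_NO": "B", "DEPT_NAME": "Sales"},
    {"DEPT_NO": "C", "DEPT_NAME": "Legal"},
]

def cross(r, s):
    """r X s: pair every tuple of r with every tuple of s."""
    return [{**t1, **t2} for t1, t2 in product(r, s)]

pairs = cross(employee, department)
print(len(pairs))  # 6, i.e. 2 x 3

# Keeping only the pairs whose department codes agree gives an equi join.
matched = [t for t in pairs if t["EMP_DEPT"] == t["DEPT_NO"]]
print([(t["EMP_NAME"], t["DEPT_NAME"]) for t in matched])
# [('Smith', 'Marketing'), ('Harry', 'Legal')]
```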
Rename Operation:
❖ The rename operation is used to rename the output relation. It is denoted by
rho (ρ).
Example:
We can use the rename operator to rename STUDENT relation to STUDENT1.
ρ(STUDENT1, STUDENT)
Join Operations:
❖ A Join operation combines related tuples from different relations, if and only if
a given join condition is satisfied. It is denoted by ⋈.
Operation: (EMPLOYEE ⋈ SALARY)
Result:
Types of Join operations
Natural Join:
❖ A natural join is the set of tuples of all combinations in R and S that are equal
on their common attribute names.
It is denoted by ⋈.
Example: Let's use the above EMPLOYEE table and SALARY table:
Input:
∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY)
Output:
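A natural join can be sketched as a filter over all pairs of tuples that agree on the shared attribute names. The EMPLOYEE and SALARY rows and the helper `natural_join` below are assumptions made for illustration.

```python
def natural_join(r, s):
    """Combine tuples of r and s that agree on all shared attribute names."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [
        {**t1, **t2}
        for t1 in r
        for t2 in s
        if all(t1[a] == t2[a] for a in common)
    ]

# Hypothetical EMPLOYEE and SALARY relations sharing the attribute EMP_CODE.
employee = [
    {"EMP_CODE": 101, "EMP_NAME": "Stephan"},
    {"EMP_CODE": 102, "EMP_NAME": "Jack"},
    {"EMP_CODE": 103, "EMP_NAME": "Harry"},
]
salary = [
    {"EMP_CODE": 101, "SALARY": 50000},
    {"EMP_CODE": 102, "SALARY": 30000},
]

joined = natural_join(employee, salary)
# Projection on EMP_NAME and SALARY, as in the expression above.
result = [(t["EMP_NAME"], t["SALARY"]) for t in joined]
print(result)  # [('Stephan', 50000), ('Jack', 30000)]
```

Harry has no SALARY tuple and therefore does not appear in the result; this is the missing information that outer joins are designed to keep.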
Equi join:
❖ An equi join is an inner join whose join condition uses only the equality
comparison operator (=). It is the most common kind of join and returns only
the tuples whose values match under that condition.
Example:
Input:
CUSTOMER ⋈ PRODUCT
Output:
Outer Join:
❖ The outer join operation is an extension of the join operation. It is used to deal
with missing information.
Example: EMPLOYEE
(EMPLOYEE ⋈ FACT_WORKERS)
Output:
An outer join is basically of three types:
➢Left outer join
➢Right outer join
➢Full outer join
Left outer join:
❖ Left outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names.
❖ In addition, the left outer join contains the tuples in R that have no matching
tuples in S, with the attributes that come only from S set to null.
❖ It is denoted by ⟕.
Example: Using the above EMPLOYEE table and FACT_WORKERS table
Input:
EMPLOYEE ⟕ FACT_WORKERS
Right outer join:
o Right outer join contains the set of tuples of all combinations in R and S that are
equal on their common attribute names.
o In addition, the right outer join contains the tuples in S that have no matching
tuples in R, with the attributes that come only from R set to null.
o It is denoted by ⟖.
Example: Using the above EMPLOYEE table and FACT_WORKERS Relation
Input:
EMPLOYEE ⟖ FACT_WORKERS
Output:
Full outer join:
o Full outer join combines the left and right outer joins: it contains all rows
from both tables.
o Besides the matching combinations, it contains the tuples in R that have no
matching tuples in S and the tuples in S that have no matching tuples in R, with
the missing attributes set to null.
o It is denoted by ⟗.
Example: Using the above EMPLOYEE table and FACT_WORKERS table
Input:
EMPLOYEE ⟗ FACT_WORKERS
Output:
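The three outer joins differ only in which unmatched tuples they keep. The sketch below is one possible way to express that, using `None` in place of null; the function `outer_join`, its parameters, and the EMPLOYEE and FACT_WORKERS rows are all assumptions made for illustration.

```python
def outer_join(r, s, r_attrs, s_attrs, keep_left=True, keep_right=True):
    """Natural outer join of r and s; None stands in for null padding."""
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = r_attrs + [a for a in s_attrs if a not in r_attrs]
    result, matched_s = [], set()
    for t1 in r:
        hit = False
        for i, t2 in enumerate(s):
            if all(t1[a] == t2[a] for a in common):
                result.append({**t1, **t2})
                matched_s.add(i)
                hit = True
        if keep_left and not hit:
            # Unmatched tuple of r: attributes found only in s become None.
            result.append({a: t1.get(a) for a in all_attrs})
    if keep_right:
        for i, t2 in enumerate(s):
            if i not in matched_s:
                # Unmatched tuple of s: attributes found only in r become None.
                result.append({a: t2.get(a) for a in all_attrs})
    return result

# Hypothetical relations sharing the attribute EMP_NAME.
E_ATTRS, F_ATTRS = ["EMP_NAME", "CITY"], ["EMP_NAME", "BRANCH"]
employee = [{"EMP_NAME": "Ram", "CITY": "Mumbai"}, {"EMP_NAME": "Hari", "CITY": "Delhi"}]
fact_workers = [{"EMP_NAME": "Ram", "BRANCH": "Infosys"}, {"EMP_NAME": "Kuber", "BRANCH": "HCL"}]

left = outer_join(employee, fact_workers, E_ATTRS, F_ATTRS, keep_right=False)
right = outer_join(employee, fact_workers, E_ATTRS, F_ATTRS, keep_left=False)
full = outer_join(employee, fact_workers, E_ATTRS, F_ATTRS)

print(left)   # Ram matched; Hari kept with BRANCH None
print(right)  # Ram matched; Kuber kept with CITY None
print(full)   # all three employees appear
```

With both flags set to False the same function returns the plain natural join, which makes the relationship between inner and outer joins explicit.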
Division operation
❖ The division operator is used for queries that involve the word ‘all’.
❖ R1 ÷ R2 = the values over the attributes of R1 that are not in R2 which are
associated in R1 with every tuple of R2.
Example
Retrieve the name of the subject that is taught in all courses
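A query of this kind can be sketched in Python for the simplest case, a binary relation divided by a unary one. The helper `divide`, the relation names, and the (subject, course) data are assumptions made for illustration.

```python
def divide(r1, r2):
    """r1 / r2 for a binary r1(x, y) and a unary r2(y), both sets of tuples.

    Returns every x that is paired in r1 with all y values in r2.
    """
    xs = {x for x, _ in r1}
    ys = {y for (y,) in r2}
    return {x for x in xs if all((x, y) in r1 for y in ys)}

# Hypothetical data: (subject, course) pairs and the list of all courses.
taught_in = {
    ("DBMS", "BTech"), ("DBMS", "MTech"), ("DBMS", "BSc"),
    ("OS", "BTech"), ("OS", "MTech"),
}
courses = {("BTech",), ("MTech",), ("BSc",)}

print(divide(taught_in, courses))  # {'DBMS'}
```

OS is excluded because it is not taught in BSc, so it is not associated with all tuples of the divisor.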
Example
• Retrieve names of employees who work on all the projects that John Smith
works on.
• Consider the Employee table given below −