unit 1 dbms.patel
COURSE OUTCOMES
On completion of this course, the students shall be able to:-
CO1: Understand the database concept, system architecture and role of database administrator
Unit-1 Syllabus
Unit-1 Introduction to Databases and Relational Algebra
Overview of Databases: Database concepts, DBMS, Data Base System Architecture (Three
Level ANSI-SPARC Architecture), Advantages and Disadvantages of
DBMS, Data Independence, DBA and Responsibilities of DBA,
Relational Data Structure, Keys, Relations, Attributes, Schema and
Instances, Referential integrity, Entity integrity.
• Overview of Databases
• Database concepts, DBMS,
• Importance of DBMS
• Advantages and Disadvantages of DBMS
• DBA and Responsibilities of DBA
Common Terminologies
• Data: Facts, figures, statistics etc. having no particular meaning (e.g.
1, ABC, 19 etc).
• Record: Collection of related data items, e.g. in the above example the
three data items had no meaning. But if we organize them in the
following way, then they collectively represent meaningful
information.
https://beginnersbook.com/2015/04/e-r-model-in-dbms/
Cont…
• Table or Relation: Collection of related records
https://beginnersbook.com/2015/04/e-r-model-in-dbms/
The columns of this relation are called Fields or Attributes; the set of values a column may take is its Domain. The rows are called Tuples or Records.
Cont…
• Database: Collection of related relations. Consider the following
collection of tables:
https://beginnersbook.com/2015/04/e-r-model-in-dbms/
Files and Databases
• File: A collection of records or documents dealing with one organization, person,
area or subject (Rowley)
• Manual (paper) files
• Computer files
• Database: A collection of similar records with relationships between the records
(Rowley)
• Bibliographic, statistical, business data, images, etc.
Purpose of DBMS
• In the early days, database applications were built directly on top of file systems
• Drawbacks of using file systems to store data:
• Data redundancy and inconsistency
• Multiple file formats, duplication of information in different files
• Difficulty in accessing data
• Need to write a new program to carry out each new task
• Data isolation — multiple files and formats
• Integrity problems
• Integrity constraints (e.g. account balance > 0) become “buried” in
program code rather than being stated explicitly
• Hard to add new constraints or change existing ones
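As a rough illustration of an integrity constraint stated explicitly in the schema rather than buried in program code, the sketch below uses Python's built-in sqlite3 module. The table name account, its columns and the helper try_insert are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The rule "balance > 0" is declared once, in the schema, so the DBMS
# checks it for every program that writes to the table.
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500.0)")


def try_insert(acc_no, balance):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO account VALUES (?, ?)", (acc_no, balance))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("A-102", 250.0)   # satisfies the CHECK constraint
rejected = try_insert("A-103", -40.0)   # violates balance > 0
```

Changing the rule later means altering one table definition instead of hunting through every application program.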
Cont…
• Drawbacks of using file systems (cont.)
• Atomicity of updates
• Failures may leave database in an inconsistent state with partial updates carried
out
• Example: Transfer of funds from one account to another should either complete
or not happen at all
• Concurrent access by multiple users
• Concurrent access is needed for performance
• Uncontrolled concurrent accesses can lead to inconsistencies
• Example: Two people reading a balance and updating it at the same time
• Security problems
• Hard to provide user access to some, but not all, data
• Database systems offer solutions to all the above problems
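The funds-transfer example of atomicity can be sketched with sqlite3 as follows. This is a minimal sketch, not a banking implementation: the account table, the non-negative balance rule and the transfer function are assumptions chosen for illustration. The credit is deliberately applied first so that a failed debit would leave a partial update if the transaction were not rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_no TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100.0), ("B", 50.0)])
conn.commit()


def transfer(src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Current balance of every account, keyed by account number."""
    return dict(conn.execute("SELECT acc_no, balance FROM account"))
```

Calling transfer("A", "B", 500.0) fails on the debit, and the rollback also undoes the credit that had already been applied to B.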
Importance of DBMS
• It helps make data management more efficient and effective.
• Its query language allows quick answers to ad hoc queries.
• It provides end users better access to more and better-managed data.
• It promotes an integrated view of organization’s operations -- “big picture.”
• It reduces the probability of inconsistent data.
• Data Independence
• Efficient data access
• Data integrity and security
• Data administration
• Concurrent access and crash recovery
Examples of Database application
• Banking: all transactions
• Airlines: reservations, schedules
• Universities: registration, grades
• Sales: customers, products, purchases
• Online retailers: order tracking, customized recommendations
• Manufacturing: production, inventory, orders, supply chain
• Human resources: employee records, salaries, tax deductions
Database users
• The Database Administrator (DBA) is responsible for:
• Authorizing access to the database
• Coordinating and monitoring its use
• Acquiring software and hardware resources
• Database designers are responsible for:
• Identifying the data to be stored
• Choosing appropriate structures to represent and store this data
• System analysts
• Determine requirements of end users
• Application programmers
• Implement these specifications as programs
Cont…
• End users: People whose jobs require access to the database
• Types
• Casual end users: access the database occasionally, using a sophisticated query language when
needed (e.g. a manager).
• Naive or parametric end users: make up a large section of the end-user population. They
learn only a few facilities that they use repeatedly (e.g. a bank clerk).
• Sophisticated end users: include business analysts, scientists, engineers and others
thoroughly familiar with the system capabilities.
• Standalone users: maintain personal databases using ready-made packaged applications.
Database Administrator
• Coordinates all the activities of the database system
• Has a good understanding of the enterprise’s information resources and needs.
• Database administrator's duties include:
• Schema definition
• Storage structure and access method definition
• Schema and physical organization modification
• Granting users authority to access the database
• Backing up data
• Monitoring performance and responding to changes
• Database tuning
Database Management System (DBMS)
https://beginnersbook.com/2015/04/e-r-model-in-dbms/
Summary
In the early days, database applications were built directly on top of file
systems. Because of the many drawbacks of file systems, a paradigm shift
from file systems to Database Management Systems was needed.
FAQs
• Why is the paradigm shift from file system to database system important?
• What is the importance of a Database Management System?
• List some of the applications of DBMS.
• What is a DBA?
• What are the roles and responsibilities of a DBA?
References
• Ramez Elmasri and Shamkant B. Navathe, “Fundamentals of Database Systems”, The
Benjamin/Cummings Publishing Co.
• Abraham Silberschatz and Henry F. Korth, “Database System Concepts”, McGraw-Hill.
• C. J. Date, “An Introduction to Database Systems”, Addison-Wesley.
• Thomas M. Connolly and Carolyn E. Begg, “Database Systems: A Practical Approach to
Design, Implementation and Management”, 5/E, University of Paisley, Addison-Wesley.
APEX INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Outline
• Type of system Architectures
• Data Base System Architecture (Three Level ANSI-SPARC
Architecture)
• Data Independence
• Types of Data Independence
DBMS Architecture
• The DBMS design depends upon its architecture. The basic
client/server architecture is used to deal with a large number of PCs,
web servers, database servers, and other components that are
connected with networks.
• The client/server architecture consists of many PCs and a workstation
that are connected via the network.
• DBMS architecture depends upon how users are connected to the
database to get their requests done.
Type of Architectures
Image Source:Javatpoint.com
1-Tier Architecture
2-Tier Architecture
3-Tier Architecture
• The 3-Tier architecture contains another layer between the client and
the server. In this architecture, the client can't directly communicate
with the server.
• The application on the client-end interacts with an application server
which further communicates with the database system.
• End-user has no idea about the existence of the database beyond the
application server. The database also has no idea about any other user
beyond the application.
• The 3-Tier architecture is used in case of the large web applications.
Three schema Architecture
3-schema/views of Data
• Physical level: describes how a record (e.g., customer) is stored.
• Logical level: describes data stored in database, and the relationships
among the data.
• View level: application programs hide details of data types. Views can also
hide information (such as an employee’s salary) for security purposes.
• The database can be viewed from different levels of abstraction to reveal
different levels of detail. Working bottom-up, we find that there
are three levels of abstraction or views in the database.
• The term abstraction is very important here. Generally, it refers to the amount
of detail you want to hide. Any entity can be seen from different
perspectives and levels of complexity, each revealing a different amount
of detail.
https://beginnersbook.com/2015/04/e-r-model-in-dbms/
• The word schema means arrangement – how we want to arrange things that we have to store. The diagram above shows the three different schemas used in DBMS, seen from
different levels of abstraction.
Cont…
Three general levels:
• Internal Schema (Physical View): The way the data is stored on the storage media
(specified by the DBA).
• Conceptual Schema (Logical View): Describes the structure and constraints for the
whole database (specified and used by the programmers).
• External Schema (Sub-Schema): The view of the database as seen by the end user.
Internal or physical schema
• The lowest level, called the Internal or Physical schema, deals with the
description of how raw data items (like 1, ABC, KOL, H2 etc.) are
stored in the physical storage (Hard Disc, CD, Pen Drive etc.).
• It also describes the data type of these data items, the size of the items
in the storage media, the location (physical address) of the items in the
storage device and so on. This schema is useful for database
application developers and database administrators.
Conceptual or logical schema
• The middle level is known as the Conceptual or Logical Schema, and
deals with the structure of the entire database.
• Please note that at this level we are no longer interested in the raw data
items; we are interested in the structure of the database.
• This means we want to know the information about the attributes of
each table, the common attributes in different tables that help them to
be combined, what kind of data can be input into these attributes, and
so on.
• Conceptual or Logical schema is very useful for database
administrators whose responsibility is to maintain the entire database.
External or View Schema
• The highest level of abstraction is the External or View Schema.
• This is targeted for the end users.
• An end user does not need to know everything about the structure
of the entire database, only the details he/she needs to
work with.
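One common way to realise an external schema is a view defined over the conceptual schema. The sketch below is an assumption-laden illustration using sqlite3: the employee table, its sample rows and the view name employee_directory are all hypothetical. It shows a view that hides the salary column from end users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual (logical) schema: the full employee relation.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Sales", 52000.0), (2, "Ravi", "HR", 48000.0)],
)
# External schema: a view for end users that leaves out the salary column.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cur = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [d[0] for d in cur.description]  # column names visible in the view
view_rows = cur.fetchall()
```

Users who query only employee_directory never see salary, even though it is stored in the underlying table.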
Data Independence
• It is the property of the database which tries to ensure that if we make any change in any
level of schema of the database, the schema immediately above it would require minimal
or no change.
• Ability to modify a schema definition in one level without affecting a schema definition
in the next higher level.
• The interfaces between the various levels and components should be well defined so that
changes in some parts do not seriously influence others.
• What does this mean? We know that in a building, each floor stands on the floor below it.
If we change the design of any one floor, e.g. extending the width of a room by
demolishing the western wall of that room, it is likely that the design of the floors above
will have to be changed as well. As a result, one change needed on one particular floor would
mean changing the design of each floor until we reach the top floor, with an
increase in time, cost and labour. Would life not be easier if the change could be
contained in one floor only? Data independence is the answer to this. It removes the
additional work needed to propagate a single change to all the levels
above.
Types of Data Independence
Data independence can be classified into the following two types:
1. Physical Data Independence
2. Logical Data Independence
Physical Data Independence: This means that for any change made in the
physical schema, the need to change the logical schema is minimal. This is
practically easier to achieve.
Logical Data Independence: This means that for any change made in the
logical schema, the need to change the external schema is minimal. As we
shall see, this is a little difficult to achieve.
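Physical data independence can be illustrated, under some assumptions, with sqlite3: adding an index is a physical-level change, yet the logical schema and the query text stay the same. The student table, its rows and the index name idx_student_city are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Meera", "Pune"), (2, "Karan", "Delhi"), (3, "Isha", "Pune")],
)

QUERY = "SELECT name FROM student WHERE city = 'Pune' ORDER BY roll_no"

before = conn.execute(QUERY).fetchall()
# Physical-level change: add an access path on city. Neither the table
# definition nor the query has to be rewritten.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after = conn.execute(QUERY).fetchall()
```

The index may change how the rows are found, but not which rows the query returns.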
Attribute Types
• The set of allowed values for each attribute is called the domain of the
attribute
• Attribute values are (normally) required to be atomic; that is,
indivisible
• The special value null is a member of every domain. It indicates that
the value is “unknown”
• The null value causes complications in the definition of many
operations
Integrity Constraints
• Domain Constraints
• Allowable values for an attribute.
• Domain integrity means the definition of a valid set of values for an attribute.
You define:
- the data type,
- the length or size,
- whether a null value is allowed,
- whether the value must be unique for the attribute.
• Entity Integrity
• No primary key attribute may be null. All primary key fields MUST have data.
Entity: Anything that has some attributes is called an entity, such as a hospital,
doctor or car.
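The domain rules listed above (size, nullability, uniqueness, allowed range) can be declared in a table definition. The following sqlite3 sketch is only illustrative: the doctor table, its columns, the 20-character limit and the age range are assumptions, and the length limit is written as a CHECK because SQLite does not enforce declared column sizes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE doctor ("
    " doctor_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL CHECK (length(name) <= 20),"  # size limit, null not allowed
    " email TEXT UNIQUE,"                               # values must be unique
    " age INTEGER CHECK (age BETWEEN 23 AND 80))"       # allowed range of values
)


def accepts(row):
    """Return True if (name, email, age) satisfies every domain constraint."""
    try:
        with conn:  # commit on success, roll back on a constraint violation
            conn.execute(
                "INSERT INTO doctor (name, email, age) VALUES (?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, accepts(("Dr. Rao", "rao@example.org", 45)) succeeds, while a missing name, a repeated email, an over-long name or an out-of-range age is rejected.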
Kinds of Integrity Constraints
• Table Operations: Add, Delete, Append and Update.
• Integrity Rules:
PK must be unique.
Related fields should have the same field type.
Related tables should belong to the same DB.
Cont...
• Integrity Conditions:
Foreign key values must match existing PK values.
A record in the primary table shouldn’t be deleted if it is related to a record in another
table.
A primary key shouldn’t be changed if its record is related to a record in another table.
• Cascade update related fields:
When the PK is updated in the primary table, the value of the FK should be
updated automatically.
• Cascade delete related records:
When a record is deleted from the primary table, all related records in related
tables should be deleted as well.
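Cascading updates and deletes can be sketched with sqlite3 foreign keys. The department and employee tables and their rows are hypothetical; note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON is issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement in SQLite
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT,"
    " dept_id INTEGER REFERENCES department (dept_id)"
    "   ON UPDATE CASCADE ON DELETE CASCADE)"
)
conn.execute("INSERT INTO department VALUES (10, 'Sales'), (20, 'HR')")
conn.execute(
    "INSERT INTO employee VALUES (1, 'Asha', 10), (2, 'Ravi', 10), (3, 'Neel', 20)"
)


def employee_rows():
    """(emp_id, dept_id) pairs currently stored in the related table."""
    return conn.execute(
        "SELECT emp_id, dept_id FROM employee ORDER BY emp_id"
    ).fetchall()


# Cascade update: changing the PK in the primary table rewrites matching FKs.
conn.execute("UPDATE department SET dept_id = 11 WHERE dept_id = 10")
after_update = employee_rows()

# Cascade delete: removing a primary record removes its related records.
conn.execute("DELETE FROM department WHERE dept_id = 20")
after_delete = employee_rows()
```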
Referential Integrity Constraint
Referential integrity is the rule that any foreign key value (in the relation on the many
side) MUST match a primary key value in the relation on the one side (or
the foreign key can be null).
• The rules are:
1. You can't delete a record from a primary table if matching records
exist in a related table.
2. You can't change a primary key value in the primary table if that
record has related records.
3. You can't enter a value in the foreign key field of the related table that
doesn't exist in the primary key of the primary table.
4. However, you can enter a Null value in the foreign key, specifying
that the records are unrelated.
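The four rules above can be checked against a small sqlite3 database. This is a sketch under assumptions: the customer and orders tables and the helper allowed are invented for the example, and no cascade action is declared, so SQLite's default behaviour of rejecting the statement applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " cust_id INTEGER REFERENCES customer (cust_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
conn.execute("INSERT INTO orders VALUES (100, 1)")
conn.commit()


def allowed(sql):
    """Run one statement; return False if it breaks referential integrity."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


rule1 = allowed("DELETE FROM customer WHERE cust_id = 1")             # has matching orders
rule2 = allowed("UPDATE customer SET cust_id = 2 WHERE cust_id = 1")  # has related records
rule3 = allowed("INSERT INTO orders VALUES (101, 99)")                # no customer 99
rule4 = allowed("INSERT INTO orders VALUES (102, NULL)")              # null FK is permitted
```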
Entity integrity rule
The entity integrity constraint states that primary keys can't be null.
There must be a proper value in the primary key field.
This is because the primary key value is used to identify individual rows
in a table. If there were null values for primary keys, it would mean that
we could not identify those rows.
On the other hand, fields other than primary key fields can contain
null values. A null value means that the value for that field is not known.
A null value is different from zero or a space.
Key Constraints
• Values in a column (or columns) of a relation are unique: at most one row in
a relation instance can contain a particular value(s)
Cont...
• Primary Key
The primary key is the candidate key that is most appropriate to become the main key of
the table. It uniquely identifies each record in a table. A PK must be
unique and not null.
• Foreign Key
A foreign key is a field or set of fields whose values match a primary key in
another table.
Cont...
• Composite Key
A key that consists of two or more attributes that together uniquely identify an entity
occurrence is called a composite key. No attribute that makes up
the composite key is a key on its own.
• Secondary or Alternate Key
The candidate keys that are not selected as the primary key are known as
secondary keys or alternate keys.
• Non-key Attribute
Non-key attributes are attributes other than candidate key attributes in a table.
• Non-prime Attribute
Non-prime attributes are attributes other than primary key attributes.
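To make the idea of candidate and composite keys concrete, the sketch below searches one relation instance for minimal attribute sets whose values are unique. The function names and the enrolment data are assumptions for illustration. A key is a property of the schema, so a single instance can only rule candidates out, not prove that a key holds in general.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique in this particular instance."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


enrolment = [
    {"roll_no": 1, "course": "DBMS", "grade": "A"},
    {"roll_no": 1, "course": "OS", "grade": "B"},
    {"roll_no": 2, "course": "DBMS", "grade": "A"},
    {"roll_no": 2, "course": "OS", "grade": "A"},
]
keys = candidate_keys(enrolment, ["roll_no", "course", "grade"])
```

Here neither roll_no nor course is unique alone, so the only candidate key found is the composite (roll_no, course).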
[Figure: primary key example using the Customers table]
Summary
• Integrity constraints are a set of rules used to maintain the
quality of information.
• Thus, integrity constraints guard against accidental damage
to the database.
• Four types of integrity constraint were discussed:
• Domain integrity constraint
• Entity integrity constraint
• Key constraint
• Referential integrity constraint
What is Data Model
• A data model is an abstract model that organizes elements of data
and standardizes how they relate to one another and to the properties
of real-world entities.
• Data models define how the logical structure of a database is
modeled.
• Data models are fundamental to introducing abstraction in a
DBMS.
• Data models define how data items are connected to each other and how
they are processed and stored inside the system.
Need
To aid in the development of a sound database design that does not
allow anomalies or inconsistencies
Goal: to create database tables that do not contain duplicate data values
that can become inconsistent
Hierarchical Model
slideplayer.com
Equivalent model
Slideplayer.com
Cont...
• Advantages
• Many of the hierarchical data model’s features formed the foundation for
current data models
• Its database application advantages are replicated, implemented in a different
form, in current database environments
• Generated a large installed (mainframe) base, created a pool of programmers
who developed numerous tried-and-true business applications
• Disadvantages
• Complex to implement
• Difficult to manage
• Lacks structural independence
• Implementation limitations
• Lack of standards
Network Model
slideplayer.com
Equivalent
slideplayer.com
Cont...
• Advantages
• Represent complex data relationships more effectively
• Improve database performance
• Impose a database standard
• Disadvantages
• Too cumbersome
• The lack of ad hoc query capability put heavy pressure on programmers
• Any structural change in the database could produce havoc in all application
programs that drew data from the database
• Many database old-timers can recall the interminable information delays
Relational Model
Slideplayer.com
Equivalent model
Relational Database
A database whose logical organization is based on the relational data
model is a Relational Database.
Quantumcomputingtech.blogspot.com
Relational Model
The main highlights of this model are −
• Data is stored in tables called relations.
• Relations can be normalized.
• In normalized relations, values saved are atomic values.
• Each row in a relation is unique.
• Each column in a relation contains values from the same domain.
Entity- Relationship Model
• Entity-Relationship (ER) Model is based on the notion of real-world
entities and relationships among them. While formulating real-world
scenario into the database model, the ER Model creates entity set,
relationship set, general attributes and constraints.
• ER Model is best used for the conceptual design of a database.
tutorialspoint.com
Schema vs Instances
• Database Schema: The description of a database. Includes
descriptions of the database structure and the constraints that should
hold on the database.
• Schema Diagram: A diagrammatic display of (some aspects of) a
database schema.
• Database Instance: The actual data stored in a database at a
particular moment in time. Also called database state (or
occurrence).
DBMS Schema vs DBMS State
• Database State: Refers to the content of a database at a moment in time.
• Initial Database State: Refers to the database when it is loaded
• Valid State: A state that satisfies the structure and constraints of the database.
• Distinction
• The database schema changes very infrequently. The database state changes
every time the database is updated.
• Schema is also called intension, whereas state is called extension.
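The distinction between schema (intension) and state (extension) can be observed in a running database. In this sqlite3 sketch the course table and the helper functions schema and state are assumptions for illustration. sqlite_master is SQLite's catalog table holding the stored definitions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT NOT NULL)"
)


def schema():
    """The intension: table definitions recorded in SQLite's catalog."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()


def state():
    """The extension: the rows present at this moment."""
    return conn.execute("SELECT * FROM course ORDER BY course_id").fetchall()


schema_before, state_before = schema(), state()   # initial (empty) state
conn.execute("INSERT INTO course VALUES ('CS301', 'DBMS')")
schema_after, state_after = schema(), state()     # state changed, schema did not
```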
Summary
• A Data Model is a precise description of the data content in a system.
• Data Models are required to aid in the development of a sound
database design that does not allow anomalies or inconsistencies
where the basic goal is to create database tables that do not contain
duplicate data values that can become inconsistent.
Entity- Relationship Model
Cont...
Types of Attributes
• Simple attribute − Simple attributes are atomic values, which cannot be
divided further. For example, a student's phone number is an atomic value
of 10 digits.
• Composite attribute − Composite attributes are made of more than one
simple attribute. For example, a student's complete name may have
first_name, middle_name and last_name.
• Derived attribute − Derived attributes are the attributes that do not exist in
the physical database, but their values are derived from other attributes
present in the database. For example, average_salary in a department
should not be saved directly in the database; instead it can be derived. As
another example, age can be derived from date_of_birth.
Cont...
Types of Attributes
• Single-value attribute − Single-value attributes contain a single value.
For example − Social_Security_Number.
• Multi-value attribute − Multi-value attributes may contain more than
one value. For example, a person can have more than one phone number.
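Returning to derived attributes, the age example can be written as a small function that computes the value on demand instead of storing it. The function name age_on and its parameters are assumptions for this sketch.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Whole years between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Because age changes every year while date_of_birth does not, storing only date_of_birth avoids a value that would become stale.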
Cont...
Relationships:
• The association among entities is called a relationship. For example, an employee
works_at a department, a student enrolls in a course. Here, Works_at and Enrolls are
called relationships
Relationship Set
• A set of relationships of similar type is called a relationship set. Like entities, a relationship
too can have attributes. These attributes are called descriptive attributes.
Degree of Relationship
• The number of participating entities in a relationship defines the degree of the
relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities
Cardinality defines the number of entities in one entity set that can
be associated with entities of the other set via a relationship
set.
• Express the number of entities to which another entity can be
associated via a relationship set.
• Most useful in describing binary relationship sets.
Types of Relationship
One - One Relationship (1 – 1)
Each record in the first table can relate to only one record in the second table.
One – Many Relationship (1 - ∞)
Each record in the first table can relate to many records in the second table.
Many – Many Relationship (∞ - ∞)
Each record in the first table can relate to many records in the second table,
and each record in the second table can relate to many records in the first table.
Many – One Relationship (∞ - 1)
• More than one entity from entity set A can be associated with at most one entity
of entity set B; however, an entity from entity set B can be associated with more
than one entity from entity set A.
Cont...
One - One Relationship:- (1 – 1)
Each value in the first table could relate with only one record in the
second table
Cont...
One – Many Relationship:- (1 - ∞)
Each value in the first table could relate with many records in the
second table.
Cont...
• Many-to-many − One entity from A can be associated with more than
one entity from B and vice versa
Cont...
• Many-to-one − More than one entity from entity set A can be
associated with at most one entity of entity set B; however, an entity
from entity set B can be associated with more than one entity from
entity set A.
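The four mapping cardinalities can be told apart by counting how many partners each entity has. The sketch below classifies a relationship instance given as (a, b) pairs; the function name and the sample pairs are assumptions. A cardinality is really a constraint on the schema, so this only reports what one particular instance exhibits.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a relationship instance given as (a, b) pairs, read from A to B."""
    pairs = set(pairs)
    a_counts = Counter(a for a, _ in pairs)  # B entities related to each A entity
    b_counts = Counter(b for _, b in pairs)  # A entities related to each B entity
    a_many = any(n > 1 for n in a_counts.values())
    b_many = any(n > 1 for n in b_counts.values())
    if a_many and b_many:
        return "many-to-many"
    if a_many:
        return "one-to-many"
    if b_many:
        return "many-to-one"
    return "one-to-one"
```

For instance, one department paired with two employees is reported as one-to-many, and the same pairs with the roles swapped as many-to-one.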
ER- Diagram
Rectangles represent Entity Sets.
Diamonds represent Relationship Sets.
Lines link attributes to entity sets and entity sets to relationship sets.
Ellipses represent Attributes
Double Ellipses represent Multivalued Attributes.
Dashed Ellipses denote Derived Attributes.
Underline indicates Primary Key attributes (will study later)
ER-Diagram with Composite, Multivalued and Derived Attributes
Design Issues
• Use of entity sets vs. attributes
Choice mainly depends on the structure of the enterprise being modeled,
and on the semantics associated with the attribute in question.
• Use of entity sets vs. relationship sets
Possible guideline is to designate a relationship set to describe an action that
occurs between entities
• Binary versus n-ary relationship sets
Although it is possible to replace any nonbinary (n-ary, for n > 2) relationship
set by a number of distinct binary relationship sets, an n-ary relationship set
shows more clearly that several entities participate in a single relationship.
• Placement of relationship attributes
Participation Constraints
• Total Participation − Each entity is involved in the relationship. Total
participation is represented by double lines.
• Partial participation − Not all entities are involved in the relationship.
Partial participation is represented by single lines.
Weak Entity Sets
• An entity set that does not have sufficient attributes to form a primary key is referred to as a weak
entity set.
• The existence of a weak entity set depends on the existence of an identifying
entity set
• It must relate to the identifying entity set via a total, one-to-many relationship set
from the identifying to the weak entity set
• Identifying relationship depicted using a double diamond
• The discriminator (or partial key) of a weak entity set is the set of
attributes that distinguishes among all the entities of a weak entity set.
• The primary key of a weak entity set is formed by the primary key of the
strong entity set on which the weak entity set is existence dependent, plus
the weak entity set’s discriminator.
Weak Entity Sets (Cont.)
• We depict a weak entity set by double rectangles.
• We underline the discriminator of a weak entity set with a
dashed line.
• payment-number – discriminator of the payment entity set
• Primary key for payment – (loan-number, payment-number)
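The loan and payment example can be sketched in sqlite3 as a table whose primary key combines the owner's key with the discriminator. Column names other than loan_number and payment_number, the sample rows and the helper add_payment are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
conn.execute(
    "CREATE TABLE payment ("
    " loan_number TEXT NOT NULL REFERENCES loan (loan_number) ON DELETE CASCADE,"
    " payment_number INTEGER NOT NULL,"  # discriminator: unique only within one loan
    " payment_amount REAL,"
    " PRIMARY KEY (loan_number, payment_number))"
)
conn.execute("INSERT INTO loan VALUES ('L-17', 1000.0), ('L-23', 2000.0)")
conn.execute("INSERT INTO payment VALUES ('L-17', 1, 100.0), ('L-23', 1, 150.0)")
conn.commit()


def add_payment(loan_number, payment_number, amount):
    """Return True if the payment row was accepted, False if a key rule rejected it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO payment VALUES (?, ?, ?)",
                (loan_number, payment_number, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Payment number 1 appears under both loans without conflict, which is why the discriminator alone cannot serve as the key. The ON DELETE CASCADE clause is one way to express that a payment cannot exist without its loan.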
Generalization and Specialization
Generalization
FAQs
• Explain ER Model
• What are the design issues?
• What are derived and multivalued attributes?
• What is cardinality?
• What are the different kinds of relationships in the ER Model?
Relation Schema and Instance
• A1, A2, …, An are attributes
COURSE OUTCOMES
CO3: Apply relational algebra and relational calculus to query the database of an organization
Unit-1 Syllabus
Unit-1 Introduction to Databases and Relational Algebra
Overview of Databases: Database concepts, DBMS, Data Base System Architecture (Three
Level ANSI-SPARC Architecture), Advantages and Disadvantages of
DBMS, Data Independence, DBA and Responsibilities of DBA,
Relational Data Structure, Keys, Relations, Attributes, Schema and
Instances, Referential integrity, Entity integrity.
Data Models: Relational Model, Network Model, Hierarchical Model, ER Model:
Design, issues, Mapping constraints, ER diagram, Comparison of
Models
Relational Algebra & Calculus: Introduction, Syntax, Semantics, Additional operators, Grouping and
Ungrouping, Relational comparisons, Tuple Calculus, Domain
Relational Calculus, Calculus Vs Algebra, Computational capabilities
Formal Relational Query Languages
(Relational Algebra)
Relational Query Languages
• Relational Algebra
• Tuple Relational Calculus
• Domain Relational Calculus
Relational Algebra
• Procedural language
• Six basic operators
• select: σ
• project: Π
• union: ∪
• set difference: –
• Cartesian product: ×
• rename: ρ
• The operators take one or two relations as inputs and produce a new
relation as a result.
Select Operation
• Notation: σ p (r)
Where p is a formula in propositional calculus consisting of terms connected by: ∧ (and), ∨ (or), ¬ (not)
Each term is one of:
<attribute> op <attribute> or <constant>
where op is one of: =, ≠, >, ≥, <, ≤
• Example of selection:
σ dept_name=“Physics” (instructor)
Select Operation – selection of rows (tuples)
• Relation r
Project Operation – selection of columns (attributes)
• Relation r:
• Π A,C (r)
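Selection and projection can be sketched in a few lines of Python, with a relation modelled as a list of dictionaries. This representation, the function names and the sample instructor rows are assumptions for illustration, not a formal definition.

```python
def select(relation, predicate):
    """σ predicate (relation): keep the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Π attributes (relation): keep the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:  # a relation is a set, so duplicates go
            result.append(projected)
    return result


instructor = [
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "salary": 87000},
]

# σ dept_name="Physics" (instructor)
physics = select(instructor, lambda t: t["dept_name"] == "Physics")
# Π dept_name (instructor)
departments = project(instructor, ["dept_name"])
```

Selection keeps whole rows and filters them, whereas projection keeps every row but only some columns, which is why duplicates can appear and must be removed.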
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)
• Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010
semester, or in both
Π course_id (σ semester=“Fall” ∧ year=2009 (section)) ∪
Π course_id (σ semester=“Spring” ∧ year=2010 (section))
r ∪ s:
Set Difference Operation
• Notation r – s
• Defined as:
r – s = {t | t ∈ r and t ∉ s}
r – s:
Set-Intersection Operation
• Notation: r ∩ s
• Defined as:
• r ∩ s = { t | t ∈ r and t ∈ s }
• Assume:
• r, s have the same arity
• attributes of r and s are compatible
• Note: r ∩ s = r – (r – s)
Set intersection of two relations
• Relation r, s:
• r ∩ s
Note: r ∩ s = r – (r – s)
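Because union-compatible relations are sets of tuples, Python's built-in set type is enough to try out union, set difference and intersection, and to check the identity r ∩ s = r – (r – s). The two small relations below are made-up sample data.

```python
# Two union-compatible relations with the same arity and compatible domains.
r = {("alpha", 1), ("alpha", 2), ("beta", 1)}
s = {("alpha", 2), ("beta", 3)}

union = r | s          # r ∪ s: tuples in r or in s
difference = r - s     # r – s: tuples in r but not in s
intersection = r & s   # r ∩ s: tuples in both r and s

# Intersection expressed with set difference only.
intersection_via_difference = r - (r - s)
```

This is why intersection is not counted among the basic operators: it can be rewritten using set difference alone.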
Cartesian-Product Operation
• Notation: r × s
• Defined as:
r × s = {t q | t ∈ r and q ∈ s}
• Assume that attributes of r(R) and s(S) are disjoint. (That is, R ∩ S = ∅.)
• If attributes of r(R) and s(S) are not disjoint, then renaming must be
used.
Joining Two Relations -- Cartesian-product
• Relations r, s:
• r × s:
Cartesian-product – naming issue
• Relations r, s: both contain an attribute named B
• r × s: the two B columns are distinguished as r.B and s.B
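A minimal sketch of the Cartesian product with the naming issue handled by prefixing. The function name cartesian_product, its parameters, and the rows are assumptions made for illustration; the sketch also assumes both inputs are non-empty lists of dicts with uniform attributes.

```python
from itertools import product

def cartesian_product(r, r_name, s, s_name):
    """r x s; attribute names found in both inputs get a relation-name prefix."""
    shared = set(r[0]) & set(s[0])  # assumes both relations are non-empty

    def qualify(t, name):
        return {(f"{name}.{a}" if a in shared else a): v for a, v in t.items()}

    return [{**qualify(t, r_name), **qualify(q, s_name)}
            for t, q in product(r, s)]

# Both relations have an attribute B, so the result carries r.B and s.B.
r = [{"A": "a", "B": 1}, {"A": "b", "B": 2}]
s = [{"B": 1, "C": 10}, {"B": 3, "C": 20}, {"B": 3, "C": 30}]
rs = cartesian_product(r, "r", s, "s")
```

The result has |r| × |s| tuples, here 2 × 3 = 6, because every tuple of r is paired with every tuple of s.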
Rename Operation
• Allows us to name, and therefore to refer to, the results of relational-algebra
expressions.
• Allows us to refer to a relation by more than one name.
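• Example:
ρ_x(E) returns the result of expression E under the name x
• Operations can be composed, e.g. a selection applied to a Cartesian product:
• Relations r, s
• r × s
• σ_{A=C}(r × s)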
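Renaming and composition can be sketched together. Here rename qualifies every attribute with the new relation name, which is one possible way to model ρ and not the only one; the function names and rows are assumptions.

```python
from itertools import product

def rename(relation, new_name):
    """rho_x(E): same tuples, every attribute qualified with the new name x."""
    return [{f"{new_name}.{a}": v for a, v in t.items()} for t in relation]

def cross(r, s):
    """r x s for relations whose attribute names are disjoint."""
    return [{**t, **q} for t, q in product(r, s)]

r = [{"A": 1, "B": 10}, {"A": 2, "B": 20}]
s = [{"C": 1, "D": "x"}, {"C": 3, "D": "y"}]

# Composition: sigma_{A=C}(r x s)
matches = [t for t in cross(r, s) if t["A"] == t["C"]]

# Renaming lets r be paired with itself without attribute clashes.
pairs = cross(rename(r, "r1"), rename(r, "r2"))
```

Pairing a relation with itself is the typical case where a second name for the same relation is needed.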
Joining two relations – Natural Join
• Let r and s be relations on schemas R and S respectively.
Then, the “natural join” of relations r and s is a relation on schema R ∪ S obtained as follows:
• Consider each pair of tuples tr from r and ts from s.
• If tr and ts have the same value on each of the attributes in R ∩ S, add a tuple t to the result, where:
• t has the same value as tr on the attributes in R
• t has the same value as ts on the attributes in S
Natural Join Example
• Relations r, s:
Natural Join
• Notation: r ⋈ s
• Output pairs of rows from the two input relations that have the same value on all attributes that have the same name.
• ∪ (Union): Π_name(instructor) ∪ Π_name(student)
Output the union of tuples from the two input relations.
• ⋈ (Natural Join): instructor ⋈ department
Output pairs of rows from the two input relations that have the same value on all attributes that have the same name.
Formal Definition
• A basic expression in the relational algebra consists of either one of the
following:
• A relation in the database
• A constant relation
• Let E1 and E2 be relational-algebra expressions; the following are all
relational-algebra expressions:
• E1 ∪ E2
• E1 – E2
• E1 × E2
• σ_p(E1), p is a predicate on attributes in E1
• Π_s(E1), s is a list consisting of some of the attributes in E1
• ρ_x(E1), x is the new name for the result of E1
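The recursive definition above can be mirrored by a small expression tree with one node type per case. This is a sketch under assumptions: the class names, the evaluate function, and the instructor/student rows are all hypothetical, and rename is left out to keep the example short.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable

@dataclass
class Rel:            # basic expression: a database or constant relation
    rows: list

@dataclass
class Union:          # E1 U E2
    left: object
    right: object

@dataclass
class Difference:     # E1 - E2
    left: object
    right: object

@dataclass
class Product:        # E1 x E2 (attribute names assumed disjoint)
    left: object
    right: object

@dataclass
class Select:         # sigma_p(E1)
    predicate: Callable
    source: object

@dataclass
class Project:        # Pi_s(E1)
    attributes: list
    source: object

def dedup(rows):
    """Drop duplicate tuples while keeping the first occurrence."""
    out = []
    for t in rows:
        if t not in out:
            out.append(t)
    return out

def evaluate(e):
    """Evaluate an expression tree to a list of dicts (one per tuple)."""
    if isinstance(e, Rel):
        return dedup(e.rows)
    if isinstance(e, Union):
        return dedup(evaluate(e.left) + evaluate(e.right))
    if isinstance(e, Difference):
        right = evaluate(e.right)
        return [t for t in evaluate(e.left) if t not in right]
    if isinstance(e, Product):
        return [{**t, **q}
                for t, q in product(evaluate(e.left), evaluate(e.right))]
    if isinstance(e, Select):
        return [t for t in evaluate(e.source) if e.predicate(t)]
    if isinstance(e, Project):
        return dedup([{a: t[a] for a in e.attributes}
                      for t in evaluate(e.source)])
    raise TypeError(f"not a relational-algebra expression: {e!r}")

# Hypothetical data: Pi_name(sigma_{dept_name="Physics"}(instructor U student))
instructor = Rel([{"name": "Kim", "dept_name": "Physics"},
                  {"name": "Wu", "dept_name": "Finance"}])
student = Rel([{"name": "Lee", "dept_name": "Physics"},
               {"name": "Kim", "dept_name": "Physics"}])
expr = Project(["name"],
               Select(lambda t: t["dept_name"] == "Physics",
                      Union(instructor, student)))
result = evaluate(expr)
```

Because every node evaluates to a relation, any expression can be used as the input of another, which is exactly what the formal definition states.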
APEX INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
COURSE OUTCOMES
CO3 Apply relational algebra and relational calculus to query the database of organization
Additional Operations (Relational Algebra)
• Natural Join
• Outer Join
Natural-Join Operation
Notation: r ⋈ s
• Let r and s be relations on schemas R and S respectively.
Then, r ⋈ s is a relation on schema R ∪ S obtained as follows:
• Consider each pair of tuples tr from r and ts from s.
• If tr and ts have the same value on each of the attributes in R ∩ S, add a tuple t to the result, where
• t has the same value as tr on the attributes in R
• t has the same value as ts on the attributes in S
• Example:
R = (A, B, C, D)
S = (E, B, D)
• Result schema = (A, B, C, D, E)
• r ⋈ s is defined as:
Π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
Natural Join Operation – Example
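• Relations r, s (only the B and D columns survive here; the A, C and E values are missing):
r: B D      s: B D
   1 a         1 a
   2 a         3 a
   4 b         1 a
   1 a         2 b
   2 b         3 b
• r ⋈ s (B and D columns of the five result tuples):
   B D
   1 a
   1 a
   1 a
   1 a
   2 b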
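The natural join can be sketched in Python in two ways: directly from the tuple-pair definition, and as the projection of a selection over r × s for the schemas R = (A, B, C, D) and S = (E, B, D). The B and D values below follow the example; the A, C and E values are placeholders, and the function names are assumptions.

```python
from itertools import product

def natural_join(r, s):
    """r ⋈ s: combine tuple pairs that agree on every shared attribute."""
    result = []
    for tr, ts in product(r, s):
        shared = set(tr) & set(ts)
        if all(tr[a] == ts[a] for a in shared):
            joined = {**tr, **ts}
            if joined not in result:
                result.append(joined)
    return result

def join_via_product(r, s):
    """Projection of a selection over r x s, for R=(A,B,C,D), S=(E,B,D)."""
    result = []
    for tr, ts in product(r, s):                        # r x s
        if tr["B"] == ts["B"] and tr["D"] == ts["D"]:   # selection
            row = {"A": tr["A"], "B": tr["B"], "C": tr["C"],
                   "D": tr["D"], "E": ts["E"]}          # projection
            if row not in result:
                result.append(row)
    return result

# B and D follow the example; A, C and E are placeholder values.
r = [
    {"A": "a1", "B": 1, "C": "c1", "D": "a"},
    {"A": "a2", "B": 2, "C": "c2", "D": "a"},
    {"A": "a3", "B": 4, "C": "c3", "D": "b"},
    {"A": "a4", "B": 1, "C": "c4", "D": "a"},
    {"A": "a5", "B": 2, "C": "c5", "D": "b"},
]
s = [
    {"B": 1, "D": "a", "E": "e1"},
    {"B": 3, "D": "a", "E": "e2"},
    {"B": 1, "D": "a", "E": "e3"},
    {"B": 2, "D": "b", "E": "e4"},
    {"B": 3, "D": "b", "E": "e5"},
]
joined = natural_join(r, s)
```

Both functions produce the same five tuples: each of the two r tuples with (B, D) = (1, a) matches two s tuples, and the r tuple with (2, b) matches one.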
Outer Join
• Left-Outer Join
• Right-Outer Join
• Full-Outer Join
Outer Join – Example
• Relation loan
• Relation borrower
customer-name   loan-number
Jones           L-170
Smith           L-230
Hayes           L-155
Outer Join – Example
• Inner Join
loan ⋈ borrower
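An outer join keeps the tuples that an inner (natural) join would lose and pads the missing attributes with null. The sketch below uses None for null. The borrower rows come from the example; the loan rows (branch names and amounts) are assumed for illustration, as are the function names. Inputs are assumed to be non-empty.

```python
def matches(tr, ts):
    """True when two tuples agree on every attribute name they share."""
    return all(tr[a] == ts[a] for a in set(tr) & set(ts))

def natural_join(r, s):
    return [{**tr, **ts} for tr in r for ts in s if matches(tr, ts)]

def left_outer_join(r, s):
    """Natural join plus the unmatched tuples of r, padded with None (null)."""
    padding = {a: None for a in s[0] if a not in r[0]}
    unmatched = [{**tr, **padding} for tr in r
                 if not any(matches(tr, ts) for ts in s)]
    return natural_join(r, s) + unmatched

def right_outer_join(r, s):
    """Same idea with the roles of r and s swapped."""
    return left_outer_join(s, r)

def full_outer_join(r, s):
    """Left outer join plus the unmatched tuples of s, padded with None."""
    padding = {a: None for a in r[0] if a not in s[0]}
    unmatched_s = [{**ts, **padding} for ts in s
                   if not any(matches(tr, ts) for tr in r)]
    return left_outer_join(r, s) + unmatched_s

# loan rows are assumed for illustration; borrower rows follow the example.
loan = [
    {"loan-number": "L-170", "branch-name": "Downtown", "amount": 3000},
    {"loan-number": "L-230", "branch-name": "Redwood", "amount": 4000},
    {"loan-number": "L-260", "branch-name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer-name": "Jones", "loan-number": "L-170"},
    {"customer-name": "Smith", "loan-number": "L-230"},
    {"customer-name": "Hayes", "loan-number": "L-155"},
]

inner = natural_join(loan, borrower)
left = left_outer_join(loan, borrower)
right = right_outer_join(loan, borrower)
full = full_outer_join(loan, borrower)
```

With this data the inner join has two tuples, the left outer join adds loan L-260 with a null customer-name, the right outer join adds Hayes with null branch-name and amount, and the full outer join contains all four.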
Null Values
• It is possible for tuples to have a null value, denoted by null, for some of their attributes
• null signifies an unknown value or that a value does not exist.
• The result of any arithmetic expression involving null is null.
• Aggregate functions simply ignore null values
• This is an arbitrary decision; they could have returned null as the result instead.
• We follow the semantics of SQL in its handling of null values
• For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same
• Alternative: assume each null is different from each other
• Both are arbitrary decisions, so we simply follow SQL
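These three rules can be illustrated with None standing in for null. The function names and the sample values are assumptions made for this sketch.

```python
def add(a, b):
    """Arithmetic involving null (None) yields null."""
    return None if a is None or b is None else a + b

def total(values):
    """Aggregate that ignores nulls, in the spirit of SQL's sum."""
    present = [v for v in values if v is not None]
    return sum(present) if present else None

def distinct(values):
    """Duplicate elimination: two nulls count as the same value."""
    out = []
    for v in values:
        if v not in out:
            out.append(v)
    return out

amounts = [3000, None, 1700, None]
with_fee = add(amounts[1], 100)   # null + 100 is null
sum_of_amounts = total(amounts)   # nulls are skipped
unique_amounts = distinct(amounts)
```

Note the contrast: adding a null to a number gives null, yet the aggregate over the same column still returns a number because the nulls are skipped.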
Null Values
• Comparisons with null values return the special truth value unknown
• If false was used instead of unknown, then not (A < 5)
would not be equivalent to A >= 5
• Three-valued logic using the truth value unknown:
• OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
• AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
• NOT: (not unknown) = unknown
• In SQL “P is unknown” evaluates to true if predicate P evaluates to unknown
• Result of select predicate is treated as false if it evaluates to
unknown
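The three-valued truth tables above can be sketched in Python with None standing for unknown. The function names are assumptions; the select helper keeps a row only when the predicate is true, so unknown is treated like false.

```python
# None stands for the truth value unknown.
def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    return None if a is None else not a

def less_than(x, y):
    """A comparison with null gives unknown."""
    return None if x is None or y is None else x < y

def select(rows, predicate):
    """Keep a row only when the predicate is true; unknown counts as false."""
    return [t for t in rows if predicate(t) is True]

rows = [{"A": 3}, {"A": 7}, {"A": None}]
below = select(rows, lambda t: less_than(t["A"], 5))
not_below = select(rows, lambda t: not3(less_than(t["A"], 5)))
```

The row with a null A appears in neither result: A < 5 is unknown for it, and not (unknown) is still unknown.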