CS2202 Intro ER

CS2202 is a database and warehousing course taught by Dr. Samrat Mondal, with a structured evaluation policy including labs, quizzes, and exams. The course covers fundamental concepts of databases, including data models, database languages, and entity-relationship modeling. Students are expected to adhere to ethical standards and can access course materials online.

CS2202: Database & Warehousing

Dr. Samrat Mondal

1
Instructor & TAs
• Instructor
– Dr. Samrat Mondal
– Office: 402, Block 2, CSE Department
– Email: [email protected]
– Phone: 8163
• TAs
– Raghvendra Kumar ([email protected])
– Anshu Kumari ([email protected])
– Aayush Raj ([email protected])
– Ujjwal Chaudhary ([email protected])
– Shambhavi ([email protected])
– Shlok Kaushik ([email protected])
– Sannu Kumar ([email protected])

2
• Course Structure
– CS2202 (3-0-2-4)
• Class Timings/ Venue
– Mon (Lab): 2:00pm – 4:00pm / CC Lab
– Tue: 10:00am – 10:55am / LT002
– Wed: 10:00am – 10:55am / LT002
– Thur: 10:00am – 10:55am / LT002
• Course Link
– Slides, lecture materials, and assignments will be uploaded to
http://10.12.10.9/~samrat/2024_25/CS2202/

3
Evaluation Policy
– Lab: 25%
– Quizzes: 20%
– MidSem: 25%
– EndSem: 30%

Important Note: Students are expected to adhere to ethical standards
throughout their course attendance and examination participation. Engaging in
any form of academic misconduct, such as malpractice, proxy attendance, or
providing false information to gain advantages, will result in severe penalties.

4
Books

5
Let’s Begin

6
Database
• A collection of interrelated data
• Usually designed to manage large bodies of information
• Models a real-world enterprise
– Entities (e.g. students, courses)
– Relationships (e.g. students are enrolled in courses)
• A database management system (DBMS) is a collection of interrelated data and a
set of programs to access those data in a convenient and efficient way

7
Some Representative Applications

8
File System vs Database System

[Figure: File System and Database System]

9
File Systems vs DBMS
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation
• Integrity problem
• Atomicity problem
• Concurrent access anomalies
• Security and access control

10
Why use a DBMS
• Data independence and efficient access
• Reduced application development time
• Data integrity and security
• Uniform data administration
• Concurrent access, recovery from crashes

11
View of Data
• A major purpose of a database system is to provide users with an abstract view of
the data
• The data from the database must be retrieved efficiently
– this requires the use of complex data structures
• Since many users are not trained in computing
– developers hide the complexity from users through several levels of
abstraction

12
Levels of Abstraction
• Many views, a single conceptual (logical) schema and a single physical schema.
• Views describe how different users see the database
• Conceptual schema describes what data are stored and what relationships exist among those data
• Physical schema describes how the data are actually stored

[Figure: View1, View2, …, ViewN on top of the Conceptual Schema, which sits on top of the Physical Schema, which sits on top of the DB]

13
Example: University Database
• Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
• Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa:real)
– Courses(cid: string, cname:string, credits:integer)
– Enrolled(sid:string, cid:string, grade:string)
• External Schema (View):
– Course_info(cid:string, sid: string)
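
The three levels on this slide can be tried out with a small sketch. It uses Python's built-in sqlite3 module and an in-memory database, both of which are assumptions of this example and not part of the slides; the index name idx_students_sid and the sample row are hypothetical.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations from the slide.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students
    -- (the index name is hypothetical).
    CREATE INDEX idx_students_sid ON Students (sid);

    -- External schema: a view that exposes only part of Enrolled.
    CREATE VIEW Course_info AS SELECT cid, sid FROM Enrolled;
""")

# Hypothetical sample row.
conn.execute("INSERT INTO Enrolled VALUES ('s1', 'CS2202', 'A')")

# A user of the view sees cid and sid but never the grade column.
course_info_rows = conn.execute("SELECT cid, sid FROM Course_info").fetchall()
print(course_info_rows)
```

Dropping or changing the index would alter only the physical level; queries against Course_info would stay the same.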

14
Data Independence
• Applications insulated from how data is structured and stored
• Logical data independence: Protection from changes in logical structure of data
• Physical data independence: Protection from changes in physical structure of data
• Data Independence is one of the most important benefits of using a DBMS

15
Instances and Schemas
• Instance of the database: the collection of information stored in the database at a
particular moment
• Database schema: the overall design of the database

16
Data Models
• Underlying the structure of the database is the data model
– It is a collection of conceptual tools for describing data, data relationships,
data semantics and consistency constraints
– The relational model is a widely used data model
– Main concept: relation, basically a table with rows and columns
– Every relation has a schema, which describes the columns, or fields.

17
• Example of customer relation

18
Database Languages
• A database system provides –
– DDL (Data Definition Language): to specify the database schema
– DML (Data Manipulation Language): to express database queries and updates

These two form parts of a single database language such as the
widely used SQL (Structured Query Language)

19
DDL Example
create table account (
account_number char(10),
branch_name char(10),
balance integer)

20
DML
• Data manipulation may be-
– the retrieval of the information stored in the database
– the insertion of the new information in the database
– the deletion of the information in the database
– the modification of the information stored in the database
• Example
select customer_name
from customer
where customer_city = 'Palo Alto'
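
The four kinds of data manipulation listed above can be run end to end in a minimal sketch. It assumes Python's sqlite3 module and an in-memory database; the customer table is reduced to the two columns the query uses, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (only the two columns used by the slide's query).
conn.execute("CREATE TABLE customer (customer_name TEXT, customer_city TEXT)")

# DML, insertion: hypothetical sample rows.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Alice", "Palo Alto"), ("Bob", "Harrison"), ("Carol", "Palo Alto")],
)

query = (
    "SELECT customer_name FROM customer "
    "WHERE customer_city = 'Palo Alto' ORDER BY customer_name"
)

# DML, retrieval: the query from the slide, with an ORDER BY added.
palo_alto = [row[0] for row in conn.execute(query)]

# DML, modification and deletion.
conn.execute("UPDATE customer SET customer_city = 'Palo Alto' WHERE customer_name = 'Bob'")
conn.execute("DELETE FROM customer WHERE customer_name = 'Alice'")

remaining = [row[0] for row in conn.execute(query)]
print(palo_alto, remaining)
```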

21
Database Architecture

22
Two and Three Tier Architectures

23
Entity Relationship Model
• Widely used conceptual level data model
– proposed by Peter P. Chen in the 1970s
• The paper that introduced the ER model is one of the most cited articles in Computer Science
– “The Entity-Relationship Model – Toward a Unified View of Data”, Peter Chen, 1976

24
• Data model to describe the database system at the requirements collection stage
– high level description
– easy to understand for the enterprise managers
– rigorous enough to be used for system building
• Concepts available in the model
– entities and attributes of entities
– relationships between entities
– diagrammatic notation

25
Entities
• Entity:
– Real-world object distinguishable from other objects
– An entity is described (in DB) using a set of attributes
• In the University database context, an individual student, faculty member,
a class room, a course, etc. are entities
• Entity Set or Entity Type-
– Collection of entities all having the same properties.
– Student entity set –collection of all student entities.
– Course entity set –collection of all course entities.

26
Entity vs Entity Set
• Entities are not explicitly represented in ER Diagram

[Figure: entity set Student with attributes ID, DoB and Name, shown next to two entities of the set: (Name: Alice, ID: 2193, DoB: 27th Oct 2000) and (Name: Bob, ID: 3316, DoB: 12th Feb 1999)]

27
Attribute
• Each entity is described by a set of attributes/properties.
• Student entity
– StudName–name of the student.
– ID–the unique ID given to each student.
– Sex–the gender of the student etc.
• All entities in an Entity set/type have the same set of attributes.

28
Types of Attributes
• Simple Attributes
– having atomic or indivisible values.
– E.g. Dept–a string
– PhoneNumber–a ten digit number
• Composite Attributes
– having several components in the value.
– E.g. Qualification with components
– (DegreeName, Year, UniversityName)
• Derived Attributes
– Attribute value is dependent on some other attribute.
– E.g: Age depends on DateOfBirth. So age is a derived attribute.
29
Types of Attributes (2)
• Single-valued
– having only one value rather than a set of values.
– E.g., PlaceOfBirth–single string value.
• Multi-valued
– having a set of values rather than a single value.
– E.g., CoursesEnrolled attribute for student
– EmailAddress attribute for student
– PreviousDegree attribute for student.
• Attributes can be:
– simple single-valued, simple multi-valued,
– composite single-valued or composite multi-valued.
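
As a rough illustration of these attribute types, the sketch below models a student entity with Python dataclasses. The class and field names are hypothetical, and age is computed from the date of birth because it is a derived attribute.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Qualification:
    """Composite attribute: one value made of several components."""
    degree_name: str
    year: int
    university_name: str


@dataclass
class Student:
    dept: str                # simple, single-valued
    date_of_birth: date      # simple, single-valued
    email_addresses: set = field(default_factory=set)     # simple, multi-valued
    previous_degrees: list = field(default_factory=list)  # composite, multi-valued

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        birthday_pending = (today.month, today.day) < (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - int(birthday_pending)


alice = Student(dept="CSE", date_of_birth=date(2000, 10, 27))
alice.email_addresses.update({"alice@example.org", "a2193@example.org"})
alice.previous_degrees.append(Qualification("BTech", 2022, "Example University"))
```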
30
ER Diagram: Notations
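[Figure: ER diagram of the student entity set with attributes Roll_No, name (components fname, mname, lname), Program, sex, Admission_yr, age, email and dob; callouts mark the key attribute, the composite attribute, the multi-valued attribute and the derived attribute]

31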
Domains of Attributes
• Each attribute takes values from a set called its domain
• For example,
– StudentAge–{17,18, …, 55}
– HomeAddress–character strings of length 35
• Domain of composite attributes –
– cross product of domains of component attributes
• Domain of multi-valued attributes –
– set of subsets of values from the basic domain

32
Entity sets and key attributes
• Key–an attribute or a collection of attributes whose value(s) uniquely identify an
entity in the entity set
• For instance,
– RollNumber- Key for Student entity set
– EmpID- Key for Faculty entity set
– HostelName, RoomNo- Key for Student entity set (assuming that each student
gets to stay in a single room)
• A key for an entity set may have more than one attribute
• An entity set may have more than one key
• Determined by the designers
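
A quick way to see what "uniquely identify" means is to test candidate keys against sample data. The sketch below is illustrative only: the function name is_key and the student records are hypothetical. A key is a design decision about all possible data, so a check like this can refute a candidate key but cannot prove one.

```python
def is_key(entities, attributes):
    """Return True if no two entities agree on every one of the given attributes."""
    seen = set()
    for entity in entities:
        value = tuple(entity[a] for a in attributes)
        if value in seen:
            return False  # two entities share this combination of values
        seen.add(value)
    return True


# Hypothetical instance of the Student entity set.
students = [
    {"RollNumber": 2193, "Name": "Alice", "HostelName": "H1", "RoomNo": 101},
    {"RollNumber": 3316, "Name": "Bob", "HostelName": "H1", "RoomNo": 102},
    {"RollNumber": 4001, "Name": "Alice", "HostelName": "H2", "RoomNo": 101},
]

print(is_key(students, ["RollNumber"]))            # single-attribute key
print(is_key(students, ["HostelName", "RoomNo"]))  # key with two attributes
print(is_key(students, ["Name"]))                  # two students are named Alice
```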

33
Relationship
When two or more entities are associated with each other, we have an
instance of a relationship
E.g: student Alice enrolls in Discrete Mathematics course

[Figure: entity sets student (S_id, name, dob) and course (c_id, name, credit) connected by the relationship Enrolls]

34
Mathematical Interpretation
• Relationship Enrolls has Student and Course as the participating entity sets
• Formally, Enrolls ⊆ Student × Course
– Operation ‘×’ indicates cross product
• (s, c) ∈ Enrolls ⇔ Student ‘s’ has enrolled in Course ‘c’
• Tuples in Enrolls are known as relationship instances
• Enrolls is called a relationship Type/Set
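
The subset relation can be checked directly with Python sets. This is a small sketch in which the student and course names are hypothetical sample data.

```python
from itertools import product

students = {"Alice", "Bob"}
courses = {"Discrete Mathematics", "Databases"}

# Student × Course: every possible (student, course) pair.
cross_product = set(product(students, courses))

# A relationship set is some subset of the cross product.
enrolls = {("Alice", "Discrete Mathematics"), ("Bob", "Databases")}

# Enrolls ⊆ Student × Course
is_subset = enrolls <= cross_product
print(len(cross_product), is_subset)
```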

35
Keys of Relationship
• There can be only one relationship instance for every unique combination of entities
• This also means that a relationship instance is uniquely determined by the keys of its
entities
• Example: the “key” for Enrolls is {s_id, c_id}

[Figure: entity sets student (s_id, name, dob) and course (c_id, name, credit) connected by the relationship Enrolls]

$Key_{Enrolls} = Key_{student} \cup Key_{course}$

36
Relationships and Attributes
• Relationships may have attributes as well.

[Figure: entity sets student (S_id, name, dob) and course (c_id, name, credit) connected by the relationship Enrolls, which carries the attribute doe]

For example: “doe” or date of enrollment records when a student
enrolled for the course. “doe” is neither an attribute of student
nor course
37
Decision: Relationship vs. Entity?

• Q: What does this say?

[Figure: entity sets Product (name, category, price) and Person (name) connected by the relationship Purchased, which carries the attribute date]

• A: A person can only buy a specific product once (on one date)

Modeling something as a relationship makes it unique; what if
that is not appropriate?
38
Decision: Relationship vs. Entity?
• What about this way?
[Figure: entity set Purchase (PID#, date, quantity) connected to Product (name, category, price) through ProductOf and to Person (name) through BuyerOf]

• Now we can have multiple purchases per product, person pair!

We can always use a new entity instead of a relationship, for example
to permit multiple instances of each entity combination!
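
The difference between the two designs shows up once they are stored as tables. The sketch below assumes Python's sqlite3 module; the table layouts, the use of product and person names as identifiers, and the sample rows are hypothetical simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Purchased as a relationship: its key is the pair of participating entities.
    CREATE TABLE Purchased (
        product TEXT, person TEXT, date TEXT,
        PRIMARY KEY (product, person)
    );
    -- Purchase as an entity set: it has its own key, PID#.
    CREATE TABLE Purchase (
        pid INTEGER PRIMARY KEY,
        product TEXT, person TEXT, date TEXT, quantity INTEGER
    );
""")

conn.execute("INSERT INTO Purchased VALUES ('pen', 'Alice', '2025-01-10')")
try:
    # Same (product, person) pair on another date: violates the key.
    conn.execute("INSERT INTO Purchased VALUES ('pen', 'Alice', '2025-02-01')")
    second_purchase_rejected = False
except sqlite3.IntegrityError:
    second_purchase_rejected = True

# With Purchase as an entity set, both purchases can be recorded.
conn.execute("INSERT INTO Purchase VALUES (1, 'pen', 'Alice', '2025-01-10', 2)")
conn.execute("INSERT INTO Purchase VALUES (2, 'pen', 'Alice', '2025-02-01', 1)")
purchase_count = conn.execute(
    "SELECT COUNT(*) FROM Purchase WHERE product = 'pen' AND person = 'Alice'"
).fetchone()[0]
print(second_purchase_rejected, purchase_count)
```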

39
Binary Relation & Cardinality

[Figure: entity sets E1 and E2 connected through the relationship R, with m marked on the E1 side and n on the E2 side]

The number of entities from E2 that an entity from E1 can possibly be
associated with through R (and vice versa) determines the cardinality ratio of
R.

Four possibilities:
one to one, one to many, many to one and many to many
40
• One to One: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.
• One to Many: An entity in A is associated with any number (zero or more) of entities in B. An entity in B, however, can be associated with at most one entity in A.
• Many to One: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number (zero or more) of entities in A.
• Many to Many: An entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A.
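
The four cases can be told apart by counting partners on each side. The function below is a hypothetical helper: it reports the tightest cardinality ratio consistent with one sample relationship set. The real cardinality ratio is a design constraint, which sample data can violate but never establish.

```python
from collections import Counter


def cardinality_ratio(pairs):
    """Classify a binary relationship set given as (a, b) pairs, A on the left."""
    pairs = set(pairs)  # a relationship set holds each pair at most once
    partners_of_a = Counter(a for a, _ in pairs)  # number of B partners per A entity
    partners_of_b = Counter(b for _, b in pairs)  # number of A partners per B entity
    a_has_many_b = any(n > 1 for n in partners_of_a.values())
    b_has_many_a = any(n > 1 for n in partners_of_b.values())
    if a_has_many_b and b_has_many_a:
        return "many to many"
    if a_has_many_b:
        return "one to many"
    if b_has_many_a:
        return "many to one"
    return "one to one"


print(cardinality_ratio({("a1", "b1"), ("a1", "b2")}))
```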
41
[Figure: ER diagram notation for One to One, One to Many, Many to One and Many to Many relationships R between entity sets A and B]
42
Example: one to many

[Figure: entity sets Product (name, category, price) and Person (name) connected by the relationship Purchased, which carries the attribute date]

One person can be associated with at most one product, and
one product can be associated with any number (0 or more) of
persons through the Purchased relationship

43
Example: Many to One

[Figure: entity sets Product (name, category, price) and Person (name) connected by the relationship Purchased, which carries the attribute date]

One product can be associated with at most one person, and
one person can be associated with any number (0 or more) of
products through the Purchased relationship

44
[Figure: entity sets Product (name, category, price), Company (name, stockprice) and Person (address, name, ssn), connected by the relationships makes, buys and employs]

What does this say?


45
Degree of a relationship
• Degree: the number of participating entities
– Degree 2: binary
– Degree 3: ternary
– Degree n: n-ary
Binary relationships are very common and widely used

46
Multi-way Relationship
Modeling a purchase relationship between product, store and buyers

[Figure: relationship Purchase connecting the entity sets Product, Store and Buyer]

47
Multi-way Relationship
What is the meaning of the following relationship?

[Figure: relationship Purchase connecting the entity sets Product, Store and Buyer]

For each unique combination of product and buyer,
there will be at most one store associated
48
Participation Constraint
• An entity set may participate in a relationship either totally or partially
• Total participation: Every entity in the set is involved in some association
(or tuple) of the relationship
• Partial participation: Not all entities in the set are involved in associations
(or tuples) of the relationship

[Figure: E1 connected to R with total participation; E2 connected to R with partial participation]

49
Structural Constraints
• Cardinality Ratio and Participation Constraints are together called Structural Constraints
• They are called constraints as the data must satisfy them to be consistent with the
requirements
• Min-Max notation: pair of numbers (m,n) placed on the line connecting an entity to the
relationship
• m: the minimum number of times a particular entity must appear in the relationship
tuples at any point of time
– 0 –partial participation
– ≥1 –total participation
• n: similarly, the maximum number of times a particular entity can appear in the
relationship tuples at any point of time
[Figure: E1 connected to R with the label (1,1); E2 connected to R with the label (0,n)]
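
A min-max constraint can be checked mechanically against a set of relationship tuples. The helper below is a hypothetical sketch: position selects which component of each tuple holds the entity, and maximum=None stands for the unbounded n.

```python
from collections import Counter


def satisfies_min_max(entities, relationship, position, minimum, maximum=None):
    """Check that every entity appears between minimum and maximum times
    in the given position of the relationship tuples."""
    counts = Counter(t[position] for t in relationship)
    for entity in entities:
        n = counts[entity]  # Counter yields 0 for entities that never appear
        if n < minimum:
            return False
        if maximum is not None and n > maximum:
            return False
    return True


# Hypothetical data for the (1,1) / (0,n) example.
e1 = {"x1", "x2"}
e2 = {"y1", "y2", "y3"}
r = {("x1", "y1"), ("x2", "y1")}

e1_ok = satisfies_min_max(e1, r, position=0, minimum=1, maximum=1)  # (1,1)
e2_ok = satisfies_min_max(e2, r, position=1, minimum=0)             # (0,n)
print(e1_ok, e2_ok)
```

Removing ("x2", "y1") from r would leave x2 outside the relationship and break the (1,1) constraint on E1, which is exactly the total-participation requirement.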
50
Recursive relationship and Role Name
• Recursive relationship: An entity set relating to itself gives rise to a
recursive relationship
– E.g., the relationship prereqOf is an example of a recursive
relationship on the entity Course
• Role Names –used to specify the exact role in which the entity participates
in the relationships
• Role Names are essential in case of recursive relationships

[Figure: entity set Course connected to the relationship prereqOf by two lines, labelled with the role names prerequisite and course]

51
Weak Entity Set
• Weak Entity Set: An entity set whose members owe their existence to some entity in a
strong entity set
– Entities are not of independent existence
– Each weak entity is associated with some entity of the owner entity set through a
special relationship
– A weak entity set may not have a key attribute of its own
– The discriminator (or partial key) of a weak entity set is the set of attributes that
distinguishes among the weak entities that depend on the same owner entity
[Figure: owner entity set S connected through the identifying relationship R to the weak entity set W; the participation of W in R is always total]

52
Weak Entity Set Example

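[Figure: strong entity set Loan (Loan_No, Amount) connected through the identifying relationship Loan_payment to the weak entity set Payment (Payment_no, PayDate, Amount)]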
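
One common way to store a weak entity set is to combine the owner's key with the discriminator. The sketch below assumes Python's sqlite3 module. The column types, the ON DELETE CASCADE rule and the sample rows are choices made for this example and are not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE Loan (loan_no TEXT PRIMARY KEY, amount INTEGER);
    CREATE TABLE Payment (
        loan_no    TEXT NOT NULL REFERENCES Loan (loan_no) ON DELETE CASCADE,
        payment_no INTEGER NOT NULL,          -- discriminator (partial key)
        pay_date   TEXT,
        amount     INTEGER,
        PRIMARY KEY (loan_no, payment_no)     -- owner key + discriminator
    );
""")

conn.execute("INSERT INTO Loan VALUES ('L1', 1000), ('L2', 5000)")
# The same payment_no can occur under different loans.
conn.execute("INSERT INTO Payment VALUES ('L1', 1, '2025-01-05', 100)")
conn.execute("INSERT INTO Payment VALUES ('L2', 1, '2025-01-07', 500)")

try:
    # A payment cannot exist without its owner loan.
    conn.execute("INSERT INTO Payment VALUES ('L9', 1, '2025-01-09', 50)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes its dependent payments.
conn.execute("DELETE FROM Loan WHERE loan_no = 'L1'")
remaining = conn.execute("SELECT loan_no, payment_no FROM Payment").fetchall()
print(orphan_rejected, remaining)
```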

53
Modeling Subgroupings
• Some entities in an entity set may be special, i.e. worthy of their own
entity set
– Define a new entity set?
• But what if we want to maintain a connection to the current entity set?
• Better: define a sub-entity set
• Ex: Products, with the sub-entity sets Software products and Educational products

We can define subgroups in ER Diagram


54
Modeling Subgroups
• Child entity sets contain all the attributes of their parent entity set plus the new
attributes shown attached to them in the ER diagram

[Figure: entity set Product (name, price) with an isA link to the child entity sets Software Product (platforms) and Educational Product (ageGroup)]

55
Understanding Subgroups
Child subgroups contain all the attributes of all of their parent groups plus the
new attributes shown attached to them in the ER diagram

• Think in terms of records; ex:
– Product: (name, price)
– SoftwareProduct: (name, price, platforms)
– EducationalProduct: (name, price, ageGroup)

[Figure: entity set Product (name, price) with an isA link to Software Product (platforms) and Educational Product (ageGroup)]
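
The "think in terms of records" view maps naturally onto class inheritance. The sketch below uses Python dataclasses as an analogy only; the field types are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class Product:
    name: str
    price: float


@dataclass
class SoftwareProduct(Product):      # isA Product
    platforms: str


@dataclass
class EducationalProduct(Product):   # isA Product
    ageGroup: str


# Each child record carries the parent's attributes plus its own.
software_fields = [f.name for f in fields(SoftwareProduct)]
educational_fields = [f.name for f in fields(EducationalProduct)]
print(software_fields, educational_fields)
```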

56
Some Design Tips
• Avoid redundancy
• Limit the use of weak entity sets
• Don’t use an entity set when an attribute will do

57
Avoiding Redundancy
• Redundancy = saying the same thing in two (or more) different ways.
• Wastes space and (more importantly) encourages inconsistency.
• Two representations of the same fact become inconsistent if we change one and
forget to change the other.

58
Example 1

[Figure: Product (Prod_name) connected to Manufacturer (Mname, addr) through the relationship ManufacturedBy]

This design gives the address of each manufacturer exactly
once. √

59
Example 2

[Figure: Product (Prod_name, manf) connected to Manufacturer (Mname, addr) through the relationship ManufacturedBy]

This design states the manufacturer of a product twice: as
an attribute and as a related entity. X

60
Example 3

[Figure: Product (Prod_name, manf, Maddr)]

This design repeats the manufacturer’s address once for
each product and loses the address if there are temporarily
no products for a manufacturer. X
61
Entity Sets vs Attributes
• An entity set should satisfy at least one of the following conditions:
– It is more than the name of something; it has at least one nonkey attribute or
– It is the “many” in a many-one or many-many relationship.

62
Example 4

[Figure: Product (Prod_name) connected to Manufacturer (Mname, addr) through the relationship ManufacturedBy]

Manufacturer deserves to be an entity set because of the
nonkey attribute addr.
Product deserves to be an entity set because it is at the “many”
end of the many-to-one relationship ManufacturedBy. √

63
Example 5

[Figure: Product (Prod_name, manf)]

There is no need to make the manufacturer an entity set,
because we record nothing about manufacturers besides
their name. √
64
Example 7

[Figure: Product (Prod_name) connected to Manufacturer (Mname) through the relationship ManufacturedBy]

Since the manufacturer is nothing but a name, and is not at
the “many” end of any relationship, it should not be an
entity set. X
65
Don’t overuse weak entity set
• Beginning database designers often doubt that anything could be a key by itself.
– They make all entity sets weak, supported by all other entity sets to which
they are linked.

• Weak entities cannot exist without their associated strong entities. Overuse
implies a highly coupled schema, which can:
– Reduce flexibility for future modifications.
– Make data migration and separation more challenging.

66
From ER Diagram to Relations
• Entity set -> relation.
• Attributes -> attributes.
• Relationships -> relations whose attributes are only:
– The keys of the connected entity sets.
– Attributes of the relationship itself

67
Entity set to relation

[Figure: entity set Product (Prod_name, manf)]

Relation: Product(Prod_name, manf)

68
Relationship to relation

[Figure: entity sets Readers (name, addr) and Books (Bname, publisher) connected by the two relationships favorite and like]

like(name, Bname)
favorite(name, Bname)
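
The translation rules can be written out as table definitions. The sketch below assumes Python's sqlite3 module. It also assumes that name is the key of Readers and Bname is the key of Books, and it names the relationship tables Likes and Favorite (the capitalised names used on the following slides) to stay clear of the SQL keyword LIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity sets become relations with the same attributes.
    CREATE TABLE Readers (name  TEXT PRIMARY KEY, addr TEXT);
    CREATE TABLE Books   (Bname TEXT PRIMARY KEY, publisher TEXT);

    -- Relationships become relations holding only the keys of the
    -- connected entity sets (plus relationship attributes, if any).
    CREATE TABLE Likes (
        name  TEXT REFERENCES Readers (name),
        Bname TEXT REFERENCES Books (Bname),
        PRIMARY KEY (name, Bname)
    );
    CREATE TABLE Favorite (
        name  TEXT REFERENCES Readers (name),
        Bname TEXT REFERENCES Books (Bname),
        PRIMARY KEY (name, Bname)
    );
""")


def columns(table):
    """List the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


print(columns("Readers"), columns("Likes"), columns("Favorite"))
```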

69
Combining relations
• OK to combine into one relation:
– 1. The relation for an entity-set E
– 2. The relations for many-one relationships of which E is the “many.”
• Example: Readers(name, addr) and Favorite(name, Bname) combine to
make Reader1(name, addr, favBook).

70
Risk with many-many relationships
• Combining Readers with Likes would be a mistake. It leads to redundancy, as:

Name | Addr  | Bname
Rabi | Patna | The Post Office
Rabi | Patna | Gitanjali

Redundancy: the address Patna is repeated once for each book Rabi likes
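
The same redundancy appears when the two relations are joined. The sketch below assumes Python's sqlite3 module and uses the two rows from the slide as sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Readers (name TEXT, addr TEXT);
    CREATE TABLE Likes   (name TEXT, Bname TEXT);
    INSERT INTO Readers VALUES ('Rabi', 'Patna');
    INSERT INTO Likes   VALUES ('Rabi', 'The Post Office'), ('Rabi', 'Gitanjali');
""")

# Combining Readers with the many-many relationship Likes into one relation.
combined = conn.execute("""
    SELECT r.name, r.addr, l.Bname
    FROM Readers r JOIN Likes l ON r.name = l.name
    ORDER BY l.Bname
""").fetchall()

# The address is stored once in Readers but once per liked book in the combination.
addr_copies = sum(1 for _, addr, _ in combined if addr == "Patna")
print(combined, addr_copies)
```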

71
