DBMS

Uploaded by

patilsiddeshb16
Copyright
© © All Rights Reserved

DBMS

What is Data?

• Data is a collection of raw, unorganized facts and details such as text, observations, figures, symbols, and descriptions of things.
• On its own, data carries no specific meaning and has no significance.
• In computing, data is measured in bits and bytes, the basic units of information for computer storage and processing.
• Data can be recorded, but it has no meaning until it is processed.

What is Information?
• Information is processed, organized, and structured data.
• It provides context for the data and enables decision making.
• It is processed data that makes sense to us.
• Information is extracted from data by analyzing and interpreting pieces of data.
• E.g., the records of all the people living in your locality are data. When you analyze and interpret that data and reach conclusions such as:
i. There are 100 senior citizens.
ii. The sex ratio is 1.1.
iii. There are 100 newborn babies.
these conclusions are information.

Example of Data: "15"


In this example, "15" is just a number. It doesn't convey any specific meaning on its own.
Example of Information: "15 apples"
In this example, "15 apples" is meaningful information because it provides context to the
number "15." It tells us that we are talking about a quantity of apples, which makes the data
more relevant and useful.
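
The locality example above can be sketched in a few lines of Python: raw records are the data, and the summary computed from them is the information. The records, the field names, and the age threshold of 60 for "senior citizen" are all assumptions made up for this illustration.

```python
# Raw data: individual records that say little on their own (hypothetical values).
residents = [
    {"name": "A", "age": 72, "sex": "F"},
    {"name": "B", "age": 34, "sex": "M"},
    {"name": "C", "age": 0, "sex": "F"},
    {"name": "D", "age": 65, "sex": "M"},
]


def summarize(records):
    """Process raw records into summary figures that can support a decision."""
    seniors = sum(1 for r in records if r["age"] >= 60)  # assumed threshold
    newborns = sum(1 for r in records if r["age"] == 0)
    females = sum(1 for r in records if r["sex"] == "F")
    males = sum(1 for r in records if r["sex"] == "M")
    return {
        "senior_citizens": seniors,
        "newborns": newborns,
        "female_to_male_ratio": females / males if males else None,
    }


# Information: the same data after analysis and interpretation.
info = summarize(residents)
```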

Data vs Information
a) Data is a collection of facts, while information puts those facts into context.
b) While data is raw and unorganized, information is organized.
c) Data points are individual and sometimes unrelated. Information maps out that data to
provide a big-picture view of how it all fits together.
d) Data, on its own, is meaningless. When it’s analyzed and interpreted, it becomes
meaningful information.
e) Data does not depend on information; however, information depends on data.
f) Data typically comes in the form of graphs, numbers, figures, or statistics. Information is
typically presented through words, language, thoughts, and ideas.
g) Data isn’t sufficient for decision-making, but you can make decisions based on
information.
What is a Database?
• A database is an electronic system where data is stored so that it can be easily accessed, managed, and updated.
• To make real use of data, we need a database management system (DBMS).
What is a DBMS?
A DBMS is a set of programs that acts as an interface between the user and the database (a collection of interrelated data). It lets the user access, delete, and modify the data in the database.
The primary goal of a DBMS is to provide a way to store and retrieve data efficiently.

Key Features of DBMS


A Database Management System (DBMS) encompasses essential features crucial for efficient
data management. Let's delve into the key aspects:
1. Data Modeling
DBMS facilitates the creation and modification of data models, defining the structure and
relationships within the database.
2. Data Storage and Retrieval
Responsible for storing and retrieving data, DBMS provides diverse methods for efficient
searching and querying.
3. Concurrency Control
DBMS includes mechanisms ensuring concurrent access, allowing multiple users to interact
with the data simultaneously without conflicts.
4. Data Integrity and Security
Tools within DBMS enforce data integrity and security constraints, validating data values and
implementing access controls.
5. Backup and Recovery
DBMS offers mechanisms for data backup and recovery, safeguarding against system failures
and ensuring business continuity.

Q1. What are the key features of DBMS?
Q2. What are the functions of DBMS? (Same as the key features.)
Q3. What are the major capabilities of DBMS?
Answer: A DBMS usually deals with CRUD (Create, Read, Update, and Delete) operations on databases. Its major capabilities are:
• Data storage
• Data retrieval
• Data deletion
• Data update
• Data security
• Data independence
Example: University Student Management System
1. Data Redundancy and Inconsistency
The university stores student information in multiple files: `students.txt` (general info),
`courses.txt` (enrolled courses), and `grades.txt` (course grades). Each file includes the
student's name and ID.

Problem: When a student changes their address, the update must be applied to every file. If
the address is updated in `students.txt` but not in `courses.txt` or `grades.txt`, the data
becomes inconsistent. This redundancy increases the storage requirement and leads to
inconsistencies.
DBMS Solution: A DBMS stores the student's information in a single table `Students` and
references this table in other tables like `Courses` and `Grades`, eliminating redundancy and
ensuring consistency.
2. Difficulty in Accessing Data
To find all students enrolled in a specific course and their grades, you need to combine
information from `students.txt`, `courses.txt`, and `grades.txt`.
Problem: In a file system, this requires writing a complex program to parse and integrate data from multiple files.
DBMS Solution: In a DBMS, a single SQL query that joins the related tables retrieves this information.
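
A minimal sketch of such a query, using Python's built-in `sqlite3` module so that it runs without a database server. The table layout (a `Grades` table that records both enrollment and grade), the column names, and the sample rows are assumptions for illustration, not the notes' actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE Students (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Courses  (course_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Grades (
    student_id INTEGER REFERENCES Students(student_id),
    course_id  TEXT    REFERENCES Courses(course_id),
    grade      INTEGER
);
INSERT INTO Students VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Courses  VALUES ('CS101', 'Databases'), ('CS102', 'Networks');
INSERT INTO Grades   VALUES (1, 'CS101', 88), (2, 'CS101', 75), (1, 'CS102', 91);
""")


def students_in_course(course_id):
    """Return (name, grade) for every student enrolled in the given course."""
    return conn.execute(
        """
        SELECT s.name, g.grade
        FROM Grades AS g
        JOIN Students AS s ON s.student_id = g.student_id
        WHERE g.course_id = ?
        ORDER BY s.name
        """,
        (course_id,),
    ).fetchall()


result = students_in_course("CS101")
```

The join replaces the hand-written program that would otherwise have to open and match up three separate files.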
3. Data Isolation
The university wants to generate a report of all courses a student has taken along with their
grades.
Problem: In a file system, student data, course enrollments, and grades are in separate files,
making it difficult to integrate and retrieve combined information.
DBMS Solution: A DBMS allows data to be stored in related tables, making it easy to join these
tables and retrieve integrated information.
4. Integrity Problems
Scenario: Each student must have a unique ID, and grades must be within a specific range (0 to 100).
Problem: A file system cannot enforce these constraints, leading to potential integrity issues
like duplicate IDs or invalid grades.
DBMS Solution: A DBMS enforces integrity constraints such as primary keys for unique IDs and
check constraints for valid grade ranges.
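
As a hedged illustration of these two constraints, the sketch below declares a primary key and a check constraint in SQLite (through Python's `sqlite3` module) and shows that bad rows are rejected. The `Results` table and the helper `try_insert` are hypothetical names introduced only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Results (
        student_id INTEGER PRIMARY KEY,                      -- unique ID per student
        grade      INTEGER CHECK (grade BETWEEN 0 AND 100)   -- valid range only
    )
""")
conn.execute("INSERT INTO Results VALUES (1, 85)")


def try_insert(student_id, grade):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Results VALUES (?, ?)", (student_id, grade))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(1, 90)   # same ID again: violates the primary key
invalid_ok = try_insert(2, 150)    # grade out of range: violates the check
valid_ok = try_insert(3, 70)       # satisfies both constraints
```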
5. Atomicity Problems
Scenario: A transaction involves registering a student for a course and updating their total
course count.
Problem: In a file system, if the system crashes after updating the course file but before
updating the student file, the data will be inconsistent.
DBMS Solution: A DBMS ensures atomicity of transactions, meaning both updates (course
registration and course count) will be completed together or not at all.
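
The all-or-nothing behaviour can be sketched with SQLite transactions. Here the "crash" is simulated by raising an exception between the two updates; the table names, columns, and the `fail_midway` flag are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (student_id INTEGER PRIMARY KEY, course_count INTEGER NOT NULL);
CREATE TABLE Enrollment (student_id INTEGER, course_id TEXT);
INSERT INTO Students VALUES (1, 0);
""")


def register(student_id, course_id, fail_midway=False):
    """Record the enrollment and bump the course count as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO Enrollment VALUES (?, ?)", (student_id, course_id))
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute(
                "UPDATE Students SET course_count = course_count + 1 WHERE student_id = ?",
                (student_id,),
            )
        return True
    except RuntimeError:
        return False


def snapshot():
    """Return (number of enrollment rows, course_count of student 1)."""
    enrolled = conn.execute("SELECT COUNT(*) FROM Enrollment").fetchone()[0]
    count = conn.execute(
        "SELECT course_count FROM Students WHERE student_id = 1"
    ).fetchone()[0]
    return enrolled, count


register(1, "CS101", fail_midway=True)
after_failure = snapshot()   # the half-done insert was rolled back
register(1, "CS101")
after_success = snapshot()   # both changes were applied together
```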
6. Concurrent-Access Anomalies
Scenario: Two administrators simultaneously update a student's course registration and
personal details.
Problem: In a file system, concurrent updates can lead to data loss or corruption if not
properly synchronized.

DBMS Solution: A DBMS handles concurrent access using locking mechanisms to prevent
conflicts, ensuring data consistency.
7. Security Problems
Scenario: Sensitive data like student grades and personal information need to be protected.
Problem: A file system offers limited security controls, making it difficult to restrict access to sensitive data based on user roles.
DBMS Solution: A DBMS provides fine-grained access control, allowing the definition of user
roles and permissions. For example, only authorized personnel can access or modify student
grades.

If we already have file systems, why do we use a DBMS instead of a file system to store and maintain data?
DBMS vs File Systems
a. File-processing systems have major disadvantages:
1. Data redundancy and inconsistency
2. Difficulty in accessing data
3. Data isolation
4. Integrity problems
5. Atomicity problems
6. Concurrent-access anomalies
7. Security problems
b. Solving these seven problems is also the advantage of a DBMS (the answer to "Why use a DBMS?").

1. Data Redundancy and inconsistency


In file-processing systems, the same data may be duplicated across multiple files, leading
to redundancy. This redundancy can result in inconsistency if updates are not properly
propagated.
Real-life Example: Consider a company that stores customer information in multiple files for
different purposes, such as sales, billing, and marketing. If a customer changes their
address, updating it in one file but forgetting to update it in others could lead to
inconsistencies in the data.
2. Difficulty in Accessing Data:
File-processing systems often lack efficient mechanisms for accessing data. Users may
need to write custom programs to extract information, which can be time-consuming and
error-prone.
Real-life Example: In a library using a file-based system to manage book records, if a
librarian wants to find all books authored by a particular author, they may need to
manually search through multiple files or records.
3. Data Isolation:
Data isolation refers to the inability to access related data together. In file-processing systems, data belonging to the same entity may be scattered across different files.
Real-life Example: In a university system using file-based storage, student information may
be stored in separate files for personal details, course registrations, and grades. Accessing
all information related to a particular student would require accessing multiple files
separately.
4. Integrity Problems:
An integrity problem occurs when the data stored in a system becomes inaccurate, inconsistent, or unreliable. It means the data doesn't match what it is supposed to be or what was intended.
Real-life Example: In a banking system using file-based storage for account information, a programming error or user mistake could lead to an account balance being incorrectly updated, resulting in financial discrepancies.
5. Atomicity Problems:
Atomicity refers to the concept that operations should either be completed entirely or not
at all. In file-based systems, if an operation involves updating multiple files, failure during
the process could leave the system in an inconsistent state where some parts of the
operation are completed while others are not.
Real-life Example: In an airline reservation system using file-based storage, if a reservation
involves updating both passenger information and seat availability files, a system crash
during the update process could result in a situation where a seat is allocated without
updating the passenger record, leading to inconsistencies.

6. Concurrent-Access Anomalies:
File-processing systems may not handle concurrent access to data properly, leading to
anomalies such as lost updates, uncommitted data, or inconsistent retrievals.
Real-life Example: In a shared document management system using file-based storage, if
multiple users attempt to modify the same document simultaneously, their changes may
overwrite each other, leading to lost updates or corrupted files.
7. Security Problems:
File-processing systems may lack robust security features, making it challenging to control
access to sensitive data and protect against unauthorized access or modifications.
Real-life Example: In a healthcare system using file-based storage for patient records, if
files are not adequately secured, unauthorized personnel may gain access to sensitive
medical information, compromising patient privacy and confidentiality.

DBMS Architecture: 2nd video

• The major purpose of a DBMS is to provide users with an abstract view of the data. That is, the system hides certain details of how the data is stored and maintained.
• To simplify user interaction with the system, abstraction is applied at several levels.
• The main objective of the three-level architecture is to enable multiple users to access the same data with a personalized view while storing the underlying data only once.

1. Physical level / Internal level

• The lowest level of abstraction; it describes how the data is stored.
• Describes the low-level data structures used.
• It has a physical schema, which describes the physical storage structure of the DB.
• Covers: storage allocation (N-ary trees etc.), data compression, and encryption.
• Goal: define algorithms that allow efficient access to data.

2. Logical level / Conceptual level:

• The conceptual schema describes the design of the database at the conceptual level: what data is stored in the DB and what relationships exist among that data.
• A user at the logical level does not need to be aware of physical-level structures.
• The DBA, who must decide what information to keep in the DB, uses the logical level of abstraction.
• Goal: ease of use.

3. View level / External level:

• The highest level of abstraction; it aims to simplify users' interaction with the system by providing a different view to each kind of end user.
• Each view schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that group.
• At the external level, a database has several schemas, sometimes called subschemas. Each subschema describes a different view of the database.
• Views also provide a security mechanism to prevent users from accessing certain parts of the DB.
Example: Student Management System
1. Physical Level:
• At the physical level, data is stored on storage devices such as disks or tapes.
• For the Student Management System, data such as student records, course information, and grades may be stored in a relational database management system (RDBMS) like MySQL, PostgreSQL, or SQL Server.
• The physical level describes how this data is organized on disk, including details such as data file locations, indexes, and data storage formats.
2. Conceptual Level:
• At the conceptual level, data is represented in the form of database tables, defining the structure of the database without specifying how it is stored physically.
• In our Student Management System, the conceptual schema may include tables such as:
o Student: Contains information about each student, including their student ID, name, address, and contact details.
o Course: Stores details about the courses offered by the institution, including course ID, title, description, and credit hours.
o Enrollment: Represents the relationship between students and the courses they are enrolled in, containing fields like student ID, course ID, and enrollment status.
• The conceptual schema defines the relationships between these tables, such as the one-to-many relationship between Student and Enrollment (a student can be enrolled in multiple courses).
3. External Level:
• At the external level, views of the data are provided to different user groups based on their requirements.
• For example, the Student Management System may have different external schemas for:
o Administrators: Provides access to student records, course management functionalities, and administrative tasks like adding or removing students and courses.
o Faculty: Allows instructors to view their course rosters, enter grades, and access student information relevant to their courses.
o Students: Provides students with access to their personal information, course schedules, grades, and registration options.
• Each external schema presents a tailored view of the data to meet the specific needs of the user group, abstracting away the complexities of the underlying database structure.

By employing the Three Schema Architecture, the Student Management System achieves
modularity, flexibility, and data abstraction, enabling efficient management of student-related
information while providing customized views for different stakeholders within the educational
institution.
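
One common way to realise an external schema is a SQL view. The sketch below, which assumes a simplified `Student` table and a hypothetical `FacultyView`, defines the logical table once and exposes only some of its columns to one user group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: the full logical table.
CREATE TABLE Student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT,
    address    TEXT,
    grade      INTEGER
);
INSERT INTO Student VALUES (1, 'Asha', '12 MG Road', 88), (2, 'Ravi', '4 Hill St', 75);

-- External level: faculty see names and grades, but not addresses.
CREATE VIEW FacultyView AS
    SELECT student_id, name, grade FROM Student;
""")

cursor = conn.execute("SELECT * FROM FacultyView ORDER BY student_id")
faculty_columns = [col[0] for col in cursor.description]  # column names of the view
faculty_rows = cursor.fetchall()
```

SQLite has no user accounts, so here the view only shapes what is visible. In a server DBMS, a view like this would typically be combined with per-role permissions so that faculty can query the view but not the base table.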
Instances and Schemas
• The collection of information stored in the DB at a particular moment is called an instance of the DB.
• The overall design of the DB is called the DB schema.
• A schema is a structural description of data. The schema doesn't change frequently; the data may change frequently.
• The DB schema corresponds to the variable declarations (along with their types) in a program.
• There are three types of schemas: physical, logical, and several view schemas called subschemas.
• The logical schema is the most important in terms of its effect on application programs, as programmers construct apps using the logical schema.
• Physical data independence: a change to the physical schema should not affect the logical schema or application programs.

How does the Three Schema Architecture promote data independence?


Ans: The Three Schema Architecture promotes data independence by separating the logical
structure (conceptual schema) from the physical storage details (internal schema) and
providing customized views for users (external schema). This separation allows modifications
to be made at one level without affecting the others, enabling flexibility and adaptability in
database design without disrupting user applications or data access.

Data Models:
• A data model provides a way to describe the design of a DB at the logical level.
• Underlying the structure of the DB is the data model: a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.
• E.g., ER model, relational model, object-oriented model, object-relational data model, etc.

Database Languages:
a. Data definition language (DDL): used to specify the database schema.
b. Data manipulation language (DML): used to express database queries and updates.
c. In practice, both sets of features are present in a single DB language, e.g., SQL.
d. DDL: we also specify consistency constraints, which must be checked every time the DB is updated.
e. DML: data manipulation involves
o Retrieval of information stored in the DB.
o Insertion of new information into the DB.
o Deletion of information from the DB.
o Updating existing information stored in the DB.
• The query language is the part of the DML used to write statements that request the retrieval of information.
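
To make the DDL/DML split concrete, here is a small sketch that runs both kinds of SQL statement against an in-memory SQLite database. The `Course` table and its columns are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema, including consistency constraints.
conn.execute("""
    CREATE TABLE Course (
        course_id TEXT PRIMARY KEY,
        title     TEXT NOT NULL,
        credits   INTEGER CHECK (credits > 0)
    )
""")

# DML: insertion, update, deletion, and retrieval (the query part).
conn.execute("INSERT INTO Course VALUES ('CS101', 'Databases', 4)")
conn.execute("INSERT INTO Course VALUES ('CS102', 'Networks', 3)")
conn.execute("UPDATE Course SET credits = 5 WHERE course_id = 'CS101'")
conn.execute("DELETE FROM Course WHERE course_id = 'CS102'")
remaining = conn.execute("SELECT course_id, title, credits FROM Course").fetchall()

# DDL again: this changes the schema itself, not the stored rows.
conn.execute("ALTER TABLE Course ADD COLUMN department TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Course)")]
```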

[Figure: DDL commands and DML commands]

A question that may come to mind: a DBMS only understands query languages like SQL, so how can an application written in another language access the database?
Ans: An application accesses a database by establishing a connection through a database driver (a library or interface; for example, in Node.js the mysql package is a widely used library for interacting with MySQL databases). It then uses a database API to execute SQL queries and commands. The application processes the retrieved data according to its own logic, handles errors, and manages database connections efficiently. This allows the application to retrieve, manipulate, and present data from the database to users, enabling dynamic and interactive functionality.
Here's a brief list of database drivers for different languages:
1. Java: JDBC (MySQL Connector/J, PostgreSQL JDBC Driver, Oracle JDBC Driver)
2. Python: psycopg2 (PostgreSQL), pymysql (MySQL), cx_Oracle (Oracle), pyodbc (SQL
Server, PostgreSQL, MySQL, etc.)
3. Node.js: mysql2 (MySQL), pg (PostgreSQL), mssql (SQL Server)
4. C++: ODBC
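
The connect, execute, fetch, and close flow described above looks roughly like this in Python. The sketch uses `sqlite3`, the SQLite driver that ships with Python, so that it runs without a database server; with another driver from the list the overall shape is similar, but connection arguments and placeholder syntax differ. The function name and the `students` table are hypothetical.

```python
import sqlite3  # Python's built-in SQLite driver (follows the DB-API 2.0 interface)


def fetch_student_names(db_path=":memory:"):
    """Connect, run SQL through the driver, and return plain Python values."""
    conn = sqlite3.connect(db_path)              # 1. establish a connection
    try:
        cur = conn.cursor()                      # 2. get a cursor from the driver
        cur.execute(
            "CREATE TABLE IF NOT EXISTS students (id INTEGER PRIMARY KEY, name TEXT)"
        )
        cur.executemany(                         # 3. send parameterized SQL
            "INSERT INTO students (name) VALUES (?)",
            [("Asha",), ("Ravi",)],
        )
        conn.commit()
        cur.execute("SELECT name FROM students ORDER BY name")
        return [row[0] for row in cur.fetchall()]  # 4. process results in app code
    except sqlite3.Error as exc:                 # 5. handle driver errors
        print(f"database error: {exc}")
        return []
    finally:
        conn.close()                             # 6. always release the connection


names = fetch_student_names()
```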

Database Administrator (DBA)

• A person who has central control of both the data and the programs that access that data.
• Functions of a DBA:
o Schema definition.
o Storage structure and access methods.
o Schema and physical organization modifications.
o Authorization control.
o Routine maintenance:
1. Periodic backups.
2. Security patches.
3. Any upgrades.

DBMS Application Architectures: Client machines, on which remote DB users work, and server machines, on which the DB system runs.
a. T1 (one-tier) architecture: the client, server, and DB are all present on the same machine.
b. T2 (two-tier) architecture
1. The app is partitioned into 2 components.
2. The client machine invokes DB system functionality at the server end through query language statements.
3. API standards like ODBC and JDBC are used for interaction between client and server.
c. T3 (three-tier) architecture
1. The app is partitioned into 3 logical components.
2. The client machine is just a frontend and doesn't contain any direct DB calls.
3. The client machine communicates with the app server, and the app server communicates with the DB system to access data.
4. The business logic (what action to take under what condition) lives in the app server itself.
5. T3 architecture is best suited to WWW applications.

6. Advantages:
• Scalability, due to distributed application servers.
• Data integrity: the app server acts as a middle layer between client and DB, which minimizes the chances of data corruption.
• Security: the client can't access the DB directly, hence it is more secure.
Data model: Video3
1. Data Model: A collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.
2. ER Model
• It is a high-level data model based on a perception of the real world as a collection of basic objects, called entities, and of relationships among these objects.
• The graphical representation of the ER model is the ER diagram, which acts as a blueprint of the DB.

3. Entity: An entity is a "thing" or "object" in the real world that is distinguishable from all other objects.
• It has physical existence.
• Each student in a college is an entity.
• An entity can be uniquely identified (by a primary attribute, aka primary key).
• Strong entity: can be uniquely identified.
• Weak entity: can't be uniquely identified on its own; it depends on some other strong entity.
o It doesn't have enough attributes to form a uniquely identifying attribute.
o E.g., Loan is a strong entity and Payment is a weak entity: instalments are numbered sequentially, and the counter can be generated separately for each loan, so a payment number is unique only within its loan.
o A weak entity depends on a strong entity for its existence.
4. Entity set
A set of entities of the same type that share the same properties, or attributes.
E.g., Student is an entity set.
E.g., the customers of a bank.

5. Attributes
• An entity is represented by a set of attributes.
• Each entity has a value for each of its attributes.
• For each attribute, there is a set of permitted values, called the domain, or value set, of that attribute.
• E.g., the Student entity has the following attributes: Student_ID, Name, Standard, Course, Batch, Contact number, Address.
Types of Attributes
1. Simple
• Attributes which can't be divided further.
• E.g., a customer's account number in a bank, a student's roll number, etc.
2. Composite
• Can be divided into subparts (that is, other attributes).
• E.g., the name of a person can be divided into first-name, middle-name, last-name.
• Useful when a user wants to refer to the entire attribute in some cases and to only a component of it in others.
• Address can also be divided into street, city, state, PIN code.
3. Single-valued
• An attribute with only one value.
• E.g., Student ID, loan-number for a loan.
4. Multi-valued
• An attribute having more than one value.
• E.g., phone-number, nominee-name on some insurance, dependent-name, etc.
• A limit constraint (upper or lower) may be applied to the number of values.
5. Derived
• The value of this type of attribute can be derived from the values of other related attributes.
• E.g., age, loan-age, membership-period, etc.
6. NULL Value
• An attribute takes a null value when an entity does not have a value for it.
• It may indicate "not applicable", i.e., the value doesn't exist, e.g., a person having no middle-name.
• It may indicate "unknown".
o Missing: e.g., if the name value of a customer is NULL, the entry is missing, because a name must have some value.
o Not known: e.g., if the salary attribute of an employee is null, the value is not known yet.

See the above figure:

Simple attributes: city, first name, last name, surname, state, zip number, etc.
Composite attributes: address, street, name
Multi-valued attribute: phone number
Single-valued attribute: customer_id
Derived attribute: age
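
The attribute types above map fairly naturally onto a small Python data structure. This is only a sketch: the class names, fields, and sample values are assumptions for illustration, and a real schema would express the same ideas as tables and columns.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Name:                        # composite attribute: made of sub-attributes
    first: str
    last: str
    middle: Optional[str] = None   # NULL value: not everyone has a middle name


@dataclass
class Customer:
    customer_id: int               # simple, single-valued attribute
    name: Name                     # composite attribute
    date_of_birth: date            # stored attribute that age is derived from
    phone_numbers: List[str] = field(default_factory=list)  # multi-valued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


c = Customer(1, Name("Asha", "Patil"), date(2000, 6, 15), ["98200-00000", "98200-11111"])
```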

Strong entity and weak entity:
Strong Entity
A strong entity is independent of all other entities in the schema. A strong entity always has a primary key. A strong entity set is a set made up of strong entities.
Representation:
• A single rectangle is used to represent a strong entity.
• A single diamond is used to represent the relationship between two strong entities.

In the above image, we have two strong entities, Employee and Department, so each is represented by a single rectangle. The relationship between them is "works in", i.e., it records which department an employee works in, so it is represented by a single diamond. If we remove the relationship between the two entities, both Employee and Department still exist, since they are independent of each other. This illustrates the independent nature of strong entities.

Weak Entity
A weak entity in DBMS is an entity whose existence depends on another strong entity and that does not have a primary key of its own.
Representation:
• A double rectangle is used to represent a weak entity.
• A double diamond is used to represent the identifying relationship between a weak entity and the strong entity it depends on.
Example: In the context of a customer relationship management system, an "Address" entity
can be considered a weak entity because it does not have a unique identifier on its own. It
relies on the existence of a "Customer" entity to which it is associated. An address can be
uniquely identified only in combination with its associated customer. Therefore, the
"Customer" entity serves as the identifying or owner entity for the "Address" entity.
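
In a relational schema, a weak entity usually ends up with a composite primary key that includes the owner's key. The sketch below uses the Loan/Payment example from earlier in these notes; the column names and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
-- Strong entity: identified by its own key.
CREATE TABLE Loan (
    loan_number INTEGER PRIMARY KEY,
    amount      INTEGER
);
-- Weak entity: payment_number is unique only within one loan,
-- so the key is (owner key, partial key).
CREATE TABLE Payment (
    loan_number    INTEGER NOT NULL REFERENCES Loan(loan_number) ON DELETE CASCADE,
    payment_number INTEGER NOT NULL,
    amount         INTEGER,
    PRIMARY KEY (loan_number, payment_number)
);
INSERT INTO Loan VALUES (100, 5000), (200, 8000);
INSERT INTO Payment VALUES (100, 1, 500), (100, 2, 500), (200, 1, 800);
""")


def orphan_payment_rejected():
    """A payment cannot exist without its owning loan."""
    try:
        conn.execute("INSERT INTO Payment VALUES (999, 1, 100)")
        return False
    except sqlite3.IntegrityError:
        return True


orphan_rejected = orphan_payment_rejected()

# Deleting the owner removes its dependent weak entities as well.
conn.execute("DELETE FROM Loan WHERE loan_number = 100")
payments_left = conn.execute(
    "SELECT loan_number, payment_number FROM Payment"
).fetchall()
```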

Examples:
1. One-to-One (1:1):
• Example: Employee and EmployeeID
• Each employee has exactly one employee ID, and each employee ID is associated with only one employee.
2. One-to-Many (1:N):
• Example: Department and Employee
• Each department can have multiple employees, but each employee belongs to only one department.
3. Many-to-One (N:1):
• Example: Employee and Manager
• Many employees can report to the same manager, but each employee has only one manager.
4. Many-to-Many (N:M):
• Example: Student and Course
• Each student can enroll in multiple courses, and each course can have multiple students enrolled.
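
When these cardinalities are turned into tables, a one-to-many relationship typically becomes a foreign key on the "many" side, and a many-to-many relationship becomes a separate junction table. A sketch under assumed table and column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- One-to-many: the foreign key lives on the "many" side (Employee).
CREATE TABLE Department (dept_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Employee (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT,
    dept_id INTEGER REFERENCES Department(dept_id)
);

-- Many-to-many: a junction table holds one row per (student, course) pair.
CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Course  (course_id  TEXT PRIMARY KEY, title TEXT);
CREATE TABLE Enrollment (
    student_id INTEGER REFERENCES Student(student_id),
    course_id  TEXT    REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO Department VALUES (1, 'Sales');
INSERT INTO Employee VALUES (10, 'Meera', 1), (11, 'John', 1);
INSERT INTO Student VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Course VALUES ('CS101', 'Databases'), ('CS102', 'Networks');
INSERT INTO Enrollment VALUES (1, 'CS101'), (1, 'CS102'), (2, 'CS101');
""")

employees_in_sales = conn.execute(
    "SELECT COUNT(*) FROM Employee WHERE dept_id = 1"
).fetchone()[0]
courses_for_asha = conn.execute(
    "SELECT COUNT(*) FROM Enrollment WHERE student_id = 1"
).fetchone()[0]
students_in_cs101 = conn.execute(
    "SELECT COUNT(*) FROM Enrollment WHERE course_id = 'CS101'"
).fetchone()[0]
```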
Participation constraints:

The total participation constraint here is between the Borrow relationship and the Loan entity. It specifies that every loan entity must participate in the Borrow relationship, meaning every loan must be associated with at least one customer through Borrow.
If total participation is enforced:
• Every loan entity must be connected to at least one customer entity through the Borrow relationship.
• No loan can exist in the system without being borrowed by some customer.
Graphically, in an Entity-Relationship Diagram (ERD), a total participation constraint is typically represented by a double line connecting the relationship to the entity.
Note: In the above example, Customer participates only partially, meaning there may be customers without a loan.
• A weak entity always has a total participation constraint, but a strong entity may not.

Extended ER Features:
1. Specialisation
2. Generalisation
3. Aggregation
1. The basic ER features studied in LEC-3 can model most DB features, but as complexity increases it is better to use some extended ER features to model the DB schema.
2. Specialisation
• In the ER model, we may need to subgroup an entity set into other entity sets that are distinct in some way from the rest of the entity set.
• Specialisation is splitting an entity set into sub entity sets on the basis of their functionalities, specialities, and features.
• It is a top-down approach.
• E.g., the Person entity set can be divided into Customer, Student, and Employee. Person is the superclass and the specialised entity sets are subclasses.
1. There is an "is-a" relationship between superclass and subclass.
2. It is depicted by a triangle component.
• Why specialisation?
1. Certain attributes may only be applicable to a few entities of the parent entity set.
2. The DB designer can show the distinctive features of the sub entities.
3. Grouping such entities through specialisation refines the overall DB blueprint.

In this picture, we have a Person entity with some attributes, but it is unclear whether attributes like salary, cust_id, and profile pic all belong to the same kind of person or to different kinds.

This creates confusion: if the person is a customer, salary is a useless attribute, but if the person is an employee, salary is a useful attribute.

This is where specialisation comes into the picture. We can split the Person entity into Customer and Employee.

Here we break Person down into two entities, Customer and Employee, and divide the attributes between them. The attributes that belong to both Customer and Employee stay on Person, since sub-entities inherit the attributes of their super-entity.

2. Generalisation
1. It is simply the reverse of specialisation.
2. The DB designer may find that certain properties of two entities overlap, and may then create a new generalised entity set. That generalised entity set will be the superclass.
3. An "is-a" relationship is present between subclass and superclass.
4. E.g., Car, Jeep, and Bus all have some common attributes. To avoid repeating the common attributes, the DB designer may generalise them into a new entity set, "Vehicle".
5. It is a bottom-up approach.
6. Why generalisation?
• Makes the DB more refined and simpler.
• Common attributes are not repeated.
Note: The ER diagram is the same as for specialisation; the difference is that here we think from the bottom up. For example, we first think about Customer and Employee, and if both entities have some common attributes, we then introduce the Person entity and assign those common attributes to it.
Attribute Inheritance
1. Both specialisation and generalisation have attribute inheritance.
2. The attributes of higher-level entity sets are inherited by lower-level entity sets.
3. E.g., Customer and Employee inherit the attributes of Person.
Participation Inheritance
1. If a parent entity set participates in a relationship, then its child entity sets also participate in that relationship.
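
The "is-a" relationship and attribute inheritance behave much like class inheritance in a programming language. The sketch below mirrors the Person/Customer/Employee example; the specific fields and default values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Person:            # superclass: attributes common to every subclass
    name: str
    address: str


@dataclass
class Customer(Person):  # a Customer "is-a" Person, plus customer-only attributes
    cust_id: int = 0


@dataclass
class Employee(Person):  # an Employee "is-a" Person, plus employee-only attributes
    salary: float = 0.0


emp = Employee(name="Meera", address="Pune", salary=50000.0)
cust = Customer(name="Ravi", address="Mumbai", cust_id=7)
```

Both subclasses inherit `name` and `address` from `Person`, while `salary` exists only on `Employee`, which removes the confusion of a salary attribute on a customer.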

3. Aggregation:
• How do we show relationships among relationships? Aggregation is the technique.
• Abstraction is applied to treat a relationship as a higher-level entity. We can call it an abstract entity.
• It avoids redundancy by aggregating a relationship into an entity set itself.

One limitation of the basic E-R model is that it cannot express relationships among relationships. Aggregation is an abstraction through which a relationship is treated as a higher-level entity.

• Example: employee jobs
– We want to assign a manager to each (employee, branch, job) combination
– This needs a separate manager entity set
– It also needs a relationship between each manager, employee, branch, and job entity

Redundant Relationships
• One option: a quaternary relationship
– This option has lots of redundant information
– Benefit is that some jobs might not require a manager
• Could also make works_on a quaternary relationship
– Don't use a separate manager relation
– Jobs with no manager would use null values instead
• These options are clumsy

Another option is to treat the works_on relationship as an aggregate.
• Here we treat the works_on relationship as an entity (i.e., a works_on entity) and then attach the manager relationship to it.

Another example of aggregation:

We want the has relationship to apply only to those students who have enrolled in a semester. To do this, we aggregate the student-attends-semester relationship into an aggregate entity and then connect the has relationship to that aggregate. This is called aggregation.
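
One way to picture aggregation at the table level, using the works_on example: the relationship gets its own table, and a second relationship points at rows of that table rather than at employee, branch, and job separately. The `manages` table name, the ID columns, and the sample rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- The works_on relationship, stored as its own table (the aggregate).
CREATE TABLE works_on (
    employee_id INTEGER,
    branch_id   INTEGER,
    job_id      INTEGER,
    PRIMARY KEY (employee_id, branch_id, job_id)
);
-- manages relates a manager to one works_on row,
-- not to the three entities separately.
CREATE TABLE manages (
    employee_id INTEGER,
    branch_id   INTEGER,
    job_id      INTEGER,
    manager_id  INTEGER NOT NULL,
    PRIMARY KEY (employee_id, branch_id, job_id),
    FOREIGN KEY (employee_id, branch_id, job_id)
        REFERENCES works_on (employee_id, branch_id, job_id)
);
INSERT INTO works_on VALUES (1, 10, 100), (2, 10, 101);
INSERT INTO manages  VALUES (1, 10, 100, 9);  -- only this combination has a manager
""")

# Combinations without a manager simply have no row in manages (no NULLs needed).
unmanaged = conn.execute("""
    SELECT w.employee_id, w.branch_id, w.job_id
    FROM works_on AS w
    LEFT JOIN manages AS m USING (employee_id, branch_id, job_id)
    WHERE m.manager_id IS NULL
""").fetchall()
```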

Note:
A supertype entity is a data model entity that has one or more other entities acting as its
subtypes. In a supertype/subtype structure, the top-level entity is called the parent entity
or supertype entity, and each lower-level entity is called a child entity or subtype entity.

Q.1: What is the purpose of generalization?
Answer:
Generalization gathers the properties that several entities have in common and creates a
generalized entity from them. It helps to improve the flexibility and reusability of the
database design.
Q.2: Why is generalization important in the database?
Answer:
Generalization is important in the database because it groups related information together,
which makes it easier and faster for the user to analyse the data and to make
decisions.
Q.3. What does it mean to generalize/specialize an object in an ER diagram?
Answer:
Generalization is the process of creating a more general object from a more specific object. In
an ER diagram, this is represented by an arrow going from the more specific object to the
more general object. Specialization is the process of creating a more specific object from a
more general object. In an ER diagram, this is represented by an arrow going from the more
general object to the more specific object.

How to think about and formulate the ER diagram (lecture 5):


1. Identify the entity sets (all entities)
2. Identify the attributes and their types
3. Identify the relationships and constraints (mapping / participation)
4. Create the ERD
Let's build the ER model for the banking system:
Before creating the ER diagram, we first collect the DB requirements:
1. Banking system ----> Branches (name-PK)
2. Bank ----> Customer
3. Customer ----> Accounts & take loan
4. Customer associated with some bankers
5. Bank has employee
6. Bank Account---- saving account and current account
7. Loan originated by branch ----> (multiple customer can take loan)---> payment
schedule
Step1: entity sets

Customer, Branch, Loan, Employee, Payment, Saving account, Current account

Step2: identify attributes:

Step3: relations and constraints

Step4: ER-Diagram

Facebook ER-diagram:

After designing the conceptual model of the database using an ER diagram, we need to convert
the conceptual model into a relational model, which can be implemented using
any RDBMS such as Oracle, MySQL, etc. So we will see what the relational model is.
The relational model uses a collection of tables to represent both data and the relationships
among those data. Each table has multiple columns, and each column has a unique name.
Tables are also known as relations. The relational model is an example of a record-based
model. Record-based models are so named because the database is structured in fixed-
format records of several types. Each table contains records of a particular type. Each record
type defines a fixed number of fields, or attributes. The columns of the table correspond to
the attributes of the record type. The relational data model is the most widely used data
model, and a vast majority of current database systems are based on the relational model.

Relational Model:
The relational model represents how data is stored in Relational Databases. A relational
database consists of a collection of tables, each of which is assigned a unique name. Consider
a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE, shown in
Table 1.
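As a rough illustration, the STUDENT relation can be created with Python's built-in sqlite3 module. The rows are made-up sample data (only the missing PHONE for ROLL_NO 4 follows these notes), and the column types are assumptions. The sketch counts the columns of the relation (its degree) and its rows (its cardinality).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE STUDENT (
    ROLL_NO INTEGER PRIMARY KEY,
    NAME    TEXT NOT NULL,
    ADDRESS TEXT,
    PHONE   TEXT,
    AGE     INTEGER
)
""")

# Hypothetical tuples; None is stored as NULL (phone not known).
rows = [
    (1, "Asha", "Pune", "9000000001", 18),
    (2, "Ravi", "Delhi", "9000000002", 19),
    (3, "Meera", "Nagpur", "9000000003", 20),
    (4, "Dev", "Mumbai", None, 19),
]
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)", rows)

cursor = conn.execute("SELECT * FROM STUDENT")
degree = len(cursor.description)      # number of attributes/columns
cardinality = len(cursor.fetchall())  # number of tuples/rows

# NULL is tested with IS NULL, not with = NULL.
unknown_phone = [
    r[0] for r in conn.execute("SELECT ROLL_NO FROM STUDENT WHERE PHONE IS NULL")
]
```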

Important Terminologies:
1. Relational Model (RM): organises the data in the form of relations (tables).
2. A relational DB consists of a collection of tables, each of which is assigned a unique name.
3. A row in a table represents a relationship among a set of values, and a table is a collection of
such relationships.
4. Tuple: a single row of the table, representing a single data point / a unique record.
5. Columns: represent the attributes of the relation. For each attribute there is a set of
permitted values, called the domain of the attribute.
6. Relation Schema: defines the design and structure of the relation; it contains the name of
the relation and all of its columns/attributes.
7. Common RM-based DBMSs, aka RDBMS: Oracle, IBM Db2, MySQL, MS Access.
8. Degree of a table: the number of attributes/columns in a given table/relation.
9. Cardinality: the total number of tuples in a given relation.
10.Relational Key: a set of attributes that can uniquely identify each tuple.
11.Relation Instance: the set of tuples of a relation at a particular instant of time is called
a relation instance. Table 1 shows the relation instance of STUDENT at a particular time. It
can change whenever there is an insertion, deletion, or update in the database.
12.NULL Values: a value that is not known or is unavailable is called a NULL value. It is
shown as a blank in the table, e.g., the PHONE of the STUDENT with ROLL_NO 4 is NULL.

13.Important properties of a Table in Relational Model:


a) The name of the relation is distinct from that of every other relation.
b) The values have to be atomic; they can't be broken down further.
c) The name of each attribute/column must be unique within the relation.
d) Each tuple must be unique in a table.
e) The order of rows and columns has no significance.
f) Tables must follow integrity constraints, which help to maintain data consistency across
the tables.

14.Relation Key: these are the keys used to identify rows uniquely and
to link tables to one another. They are of the following types:
 Super Key
 Candidate Key
 Primary Key
 Alternate Key
 Foreign Key
 Composite Key
 Compound Key
 Surrogate Key

1. Super Key (SK): any combination of attributes in a table that can uniquely identify
each tuple.

2. Candidate Key (CK): a minimal super key, i.e. a super key from which no attribute can be
removed without losing the ability to uniquely identify each tuple. It contains no redundant
attribute.
 A CK value shouldn't be NULL.
3. Primary Key (PK): the CK chosen to identify tuples, usually the one with the fewest attributes.
4. Alternate Key (AK): every CK other than the PK.
5. Foreign Key (FK):
 It creates a relationship between two tables.
 A relation, say r1, may include among its attributes the PK of another relation, say
r2. This attribute is called an FK from r1 referencing r2.
 The relation r1 is aka the Referencing (Child) relation of the FK dependency, and r2
is called the Referenced (Parent) relation of the FK.
 An FK helps to cross-reference between two different relations.
6. Composite Key: a PK formed using at least 2 attributes.
7. Compound Key: a PK formed using 2 FKs.
8. Surrogate Key:
 A synthetic PK.
 Generated automatically by the DB, usually as an integer value.
 May be used as the PK.
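The definitions of super key, candidate key, primary key, and alternate key can be tried out on a small table. This is a minimal sketch in plain Python; the attribute names and rows are hypothetical. Note the assumption it makes: it tests uniqueness against one sample instance only, whereas a real key must hold for every valid instance of the relation.

```python
from itertools import combinations

# Hypothetical relation instance: two students share the same name.
ATTRIBUTES = ("roll_no", "email", "name")
ROWS = [
    {"roll_no": 1, "email": "a@example.com", "name": "Asha"},
    {"roll_no": 2, "email": "b@example.com", "name": "Ravi"},
    {"roll_no": 3, "email": "c@example.com", "name": "Asha"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    projections = {tuple(row[a] for a in attrs) for row in rows}
    return len(projections) == len(rows)


def super_keys(attributes, rows):
    """Every non-empty combination of attributes that identifies each row."""
    return [
        set(combo)
        for size in range(1, len(attributes) + 1)
        for combo in combinations(attributes, size)
        if is_super_key(combo, rows)
    ]


def candidate_keys(attributes, rows):
    """Super keys that have no smaller super key inside them (minimal ones)."""
    keys = super_keys(attributes, rows)
    return [k for k in keys if not any(other < k for other in keys)]


all_super_keys = super_keys(ATTRIBUTES, ROWS)
all_candidate_keys = candidate_keys(ATTRIBUTES, ROWS)

# The designer picks one candidate key as the PK; the rest are alternate keys.
primary_key = all_candidate_keys[0]
alternate_keys = all_candidate_keys[1:]
```

Here name alone is not a super key because two rows share it, while roll_no and email are both candidate keys.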

Let's say this is the table we have.

Surrogate key:

In the above figure, let's say we have the databases of two schools and we want to merge the
tables of school A and school B. The problem is that we cannot use the register number as the
primary key, because the two schools use different formats for it. So, to identify each row of
the merged table uniquely, the database adds a surrogate key to the table.
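The merge can be sketched with Python's built-in sqlite3 module. The table name merged_students, its columns, and the register numbers below are hypothetical; the point is only that the database generates student_id itself, so the differing register number formats no longer matter for identification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE merged_students (
    student_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, generated by the DB
    school      TEXT NOT NULL,
    register_no TEXT NOT NULL,                      -- kept as data, in either format
    name        TEXT NOT NULL,
    UNIQUE (school, register_no)
)
""")

# Hypothetical rows: the two schools format register numbers differently.
school_a = [("210101", "Asha"), ("210102", "Ravi")]
school_b = [("CS-07", "Meera"), ("CS-08", "Dev")]

conn.executemany(
    "INSERT INTO merged_students (school, register_no, name) VALUES ('A', ?, ?)",
    school_a,
)
conn.executemany(
    "INSERT INTO merged_students (school, register_no, name) VALUES ('B', ?, ?)",
    school_b,
)

merged = conn.execute(
    "SELECT student_id, school, register_no FROM merged_students ORDER BY student_id"
).fetchall()
surrogate_ids = [row[0] for row in merged]
```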

Integrity Constraints:

 Integrity constraints are a set of rules used to maintain the quality of information.

 Integrity constraints ensure that data insertion, updating, and other processes are
performed in such a way that data integrity is not affected.
 Thus, integrity constraints guard against accidental damage to the database.
1. Domain constraints
o A domain constraint defines the valid set of values for an
attribute.
o The data type of a domain can be string, character, integer, time, date, currency, etc.
The value of the attribute must belong to the corresponding domain.
o It restricts the values an attribute of a relation can take by specifying its domain.
o It restricts the data type of every attribute.
o E.g., we want to specify that enrolment is allowed only for candidates with birth year <
2002.

Example:
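The birth-year rule can be written as a CHECK constraint. This is a minimal sketch using Python's built-in sqlite3 module; the table name candidate, its columns, and the sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE candidate (
    candidate_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    birth_year   INTEGER NOT NULL CHECK (birth_year < 2002)  -- domain constraint
)
""")


def try_enrol(candidate_id, name, birth_year):
    """Insert a candidate; return False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO candidate VALUES (?, ?, ?)",
            (candidate_id, name, birth_year),
        )
        return True
    except sqlite3.IntegrityError:
        return False


enrolled_1999 = try_enrol(1, "Asha", 1999)  # inside the domain
enrolled_2005 = try_enrol(2, "Ravi", 2005)  # outside the domain, so rejected
enrolled_count = conn.execute("SELECT COUNT(*) FROM candidate").fetchone()[0]
```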

2. Entity integrity constraints


o The entity integrity constraint states that a primary key value can't be null.
o This is because the primary key value is used to identify individual rows in a relation, and
if the primary key has a null value, then we can't identify those rows.
o A table can contain null values in fields other than the primary key.
o Every relation should have a PK, and PK != NULL.

Example:
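A short sketch of entity integrity with Python's built-in sqlite3 module follows; the employee table and its rows are hypothetical. One assumption to flag: NOT NULL is written out explicitly because SQLite, unlike the SQL standard, accepts NULL in a non-integer PRIMARY KEY column unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE employee (
    emp_id TEXT PRIMARY KEY NOT NULL,  -- PK must never be NULL
    name   TEXT                        -- non-key field, NULL is allowed
)
""")


def try_insert(emp_id, name):
    """Insert an employee; return False if a constraint rejects the row."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (emp_id, name))
        return True
    except sqlite3.IntegrityError:
        return False


normal_row = try_insert("E1", "Asha")
null_pk_row = try_insert(None, "Ravi")  # violates entity integrity
null_name_row = try_insert("E2", None)  # fine: NULL outside the PK
```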

3. Referential Integrity Constraints


o A referential integrity constraint is specified between two tables.
o Under a referential integrity constraint, if a foreign key in Table 1 refers to the primary
key of Table 2, then every value of the foreign key in Table 1 must either be null or be
present in Table 2.
Example:

Child table / referencing table

parent table / referenced table


Forms of referential constraints
1. Insert constraint: a value can't be inserted in the child table if it is not present in the
parent table.
2. Delete constraint: a value can't be deleted from the parent table if it is present in the
child table.
Ex. In the above table we can't delete the row/tuple with D_No = 11, as it is also
present in the child table. If we did delete it, there would still be a record in the child table
that no longer references any row in the parent table, which leads to inconsistency in
the database.
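Both forms can be observed with Python's built-in sqlite3 module. In this sketch the department (parent) and employee (child) tables, their columns other than D_No, and the sample rows are hypothetical; D_No = 11 mirrors the example above. SQLite checks foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (            -- parent / referenced table
    D_No   INTEGER PRIMARY KEY,
    D_Name TEXT
);
CREATE TABLE employee (              -- child / referencing table
    E_No   INTEGER PRIMARY KEY,
    E_Name TEXT,
    D_No   INTEGER REFERENCES department (D_No)
);
INSERT INTO department VALUES (11, 'Sales'), (24, 'Accounts');
INSERT INTO employee VALUES (1, 'Asha', 11);
""")


def allowed(sql):
    """Run one statement; return False if an integrity constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Insert constraint: department 99 does not exist in the parent table.
insert_unknown_dept = allowed("INSERT INTO employee VALUES (2, 'Ravi', 99)")
# A NULL foreign key is accepted.
insert_null_dept = allowed("INSERT INTO employee VALUES (3, 'Meera', NULL)")
# Delete constraint: D_No = 11 is still referenced by employee 1.
delete_referenced = allowed("DELETE FROM department WHERE D_No = 11")
# D_No = 24 is not referenced by any child row, so it can be deleted.
delete_unreferenced = allowed("DELETE FROM department WHERE D_No = 24")
```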
Is there any way to delete an entry from the parent table, even if the value is
present in the child table, without violating the delete constraint?
Yes. There are ways to handle this requirement without violating the constraints, including
using cascading deletes, setting foreign key references to NULL, or explicitly deleting related
records from the child table before deleting from the parent table.

1. Cascading Deletes
We can define a foreign key with the ON DELETE CASCADE option. This means that if a record
in the parent table is deleted, all related records in the child table will also be deleted
automatically.
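A minimal sketch of ON DELETE CASCADE, run through Python's built-in sqlite3 module. The Students and Grades tables follow the names used in these notes, but their columns other than StudentID and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply FK actions
conn.executescript("""
CREATE TABLE Students (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT
);
CREATE TABLE Grades (
    GradeID   INTEGER PRIMARY KEY,
    StudentID INTEGER,
    Grade     TEXT,
    FOREIGN KEY (StudentID) REFERENCES Students (StudentID)
        ON DELETE CASCADE
);
INSERT INTO Students VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Grades VALUES (10, 1, 'A'), (11, 1, 'B'), (12, 2, 'C');
""")

# Deleting the parent row removes the child rows that reference it.
conn.execute("DELETE FROM Students WHERE StudentID = 1")

grades_after_cascade = conn.execute(
    "SELECT GradeID, StudentID FROM Grades ORDER BY GradeID"
).fetchall()
```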

In this setup, deleting a student will automatically delete all associated grades.

2. Set Null on Delete (a question may ask: can a foreign key have a null value?)
We can define a foreign key with the ON DELETE SET NULL option. This means that if a record
in the parent table is deleted, the foreign key field in the child table will be set to NULL.
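A minimal sketch of ON DELETE SET NULL, again through Python's built-in sqlite3 module. The columns other than StudentID and the sample rows are assumptions. The foreign key column must allow NULL for this option to work, which also answers the question above: yes, a foreign key can hold a null value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to apply FK actions
conn.executescript("""
CREATE TABLE Students (
    StudentID INTEGER PRIMARY KEY,
    Name      TEXT
);
CREATE TABLE Grades (
    GradeID   INTEGER PRIMARY KEY,
    StudentID INTEGER,               -- nullable, so it can be reset to NULL
    Grade     TEXT,
    FOREIGN KEY (StudentID) REFERENCES Students (StudentID)
        ON DELETE SET NULL
);
INSERT INTO Students VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO Grades VALUES (10, 1, 'A'), (11, 1, 'B'), (12, 2, 'C');
""")

# The child rows survive, but their reference to the deleted parent is cleared.
conn.execute("DELETE FROM Students WHERE StudentID = 1")

grades_after_set_null = conn.execute(
    "SELECT GradeID, StudentID FROM Grades ORDER BY GradeID"
).fetchall()
```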

In this setup, deleting a student will set the StudentID in the Grades table to NULL.

4. Key constraints

o A key is a set of attributes that is used to identify an entity uniquely within its entity set.
o Key constraints are rules applied to a table's columns to enforce the uniqueness and
validity of data within a database.
o An entity set can have multiple keys, but one of them will be the primary key.
A primary key value must be unique and cannot be null in the relational table.

Example:
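The uniqueness side of key constraints can be sketched with Python's built-in sqlite3 module. The student table, its columns, and the rows are hypothetical: roll_no is declared as the primary key and email, an alternate key, is declared UNIQUE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE student (
    roll_no INTEGER PRIMARY KEY,   -- chosen primary key
    email   TEXT NOT NULL UNIQUE,  -- alternate key, also kept unique
    name    TEXT
)
""")


def try_insert(roll_no, email, name):
    """Insert a student; return False if a key constraint rejects the row."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)", (roll_no, email, name))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert(1, "a@example.com", "Asha"),   # first row, accepted
    try_insert(1, "b@example.com", "Ravi"),   # duplicate primary key
    try_insert(2, "a@example.com", "Meera"),  # duplicate alternate key
    try_insert(2, "b@example.com", "Ravi"),   # both keys are new, accepted
]
```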

Form of the key constraints:
