Dbms 1-4 Unit Notes

UNIT-1

DATA
Data can be defined as the representation of facts, concepts or instructions in a
formalized manner, suitable for communication, interpretation or processing by
humans or electronic machines.
Data is represented with the help of characters such as alphabets (A to Z, a to z),
digits (0-9) or special characters (+, =, *, <, >, /, \).

RECORD
Record is a collection of related data items.
For example, a payroll record for an employee contains data fields such as name, age,
qualifications, DA, HRA and PF.

DATA ITEMS/ DATA FIELDS


A set of characters which are used together to represent a specific data element.

ROLL NO.   NAME   MARKS      <- each column is a data item / data field
101        X      95
102        Y      75         <- each row is a record
103        Z      88

INFORMATION
Information is processed data that is meaningful and relevant. It results from
organizing, analysing, and interpreting data to provide context and value.
Example: In the university database, information might include the average score of
all BCA101 students, which is derived from data by performing calculations on
exam scores.

CHARACTERISTICS OF INFORMATION
1. Timely
2. Accurate
3. Complete
4. Given to the right person
5. Purpose

FILES
Files are collections of related records. They are used to store and manage data in a
structured manner within a database system. Files can be thought of as tables in a
database, each containing a set of records.
Example: In the university database, you might have files or tables for students,
courses, instructors, and grades. The "Students" file would contain records of all
students, while the "Courses" file would store information about various courses
offered.

DATABASE
A database is an organized, centralized collection of related data. The database approach involves the
use of a Database Management System (DBMS) to manage and store that data. In this approach, data is
organized into a centralized repository with structured relationships, ensuring data consistency, security and integrity.

CHARACTERISTICS OF DATABASE

1. Data Centralization: Data is stored in a central location, promoting data
consistency and reducing redundancy.
2. Data Independence: Programs and data are independent of each other, allowing for
changes to one without affecting the other.
3. Data Security: DBMSs offer security features like user authentication
and authorization to protect data.
4. Data Integrity: Data integrity rules are enforced to maintain the accuracy and
consistency of data.
5. Data Sharing: Data can be shared among multiple users and applications.
6. Concurrent Access: DBMSs support concurrent access by multiple users, ensuring
data availability.
7. Data Recovery: Backup and recovery mechanisms are in place to safeguard against
data loss.
8. Query Language: A query language (e.g., SQL) allows users to retrieve
and manipulate data easily.

(Diagram: characteristics of a database - centralization, independence, security, integrity, sharing, concurrent access, recovery and query language.)
ADVANTAGES OF DATABASE SYSTEM
1. Data Centralization: All data is stored in one central location, making it easier
to manage and access data efficiently.
2. Data Consistency: Database systems enforce data integrity constraints, ensuring that
data remains accurate and consistent.
3. Data Security: DBMS provides security features like authentication and
authorization to control who can access and modify data.
4. Data Sharing: Multiple users and applications can access and share data
simultaneously, promoting data integration and collaboration.
5. Data Independence: Changes to the database structure can be made without
affecting the application programs that use the data (data and program
independence).
6. Concurrent Access: DBMS handles concurrent access by multiple users, ensuring
data availability and consistency.
7. Backup and Recovery: Robust backup and recovery mechanisms are in place
to prevent data loss in case of system failures.
8. Query Language: A query language (e.g., SQL) allows users to retrieve,
manipulate, and analyse data easily.
9. Data Relationships: Database systems support the establishment of relationships
between data in different tables, facilitating complex queries and data modelling.
10. Scalability: DBMS can be scaled to accommodate growing data and user loads.

DISADVANTAGES OF DATABASE SYSTEM


1. Cost: Implementing and maintaining a database system can be expensive due to
licensing fees, hardware costs, and personnel training.
2. Complexity: Database design, administration, and optimization can be complex and
require specialized skills.
3. Performance Overhead: Database systems can introduce performance overhead due to
query processing and data management.
4. Data Security Risks: Despite security features, database systems can be vulnerable to
security breaches if not properly configured and maintained.
5. Data Redundancy: While database systems aim to reduce data redundancy, it's still
possible to have some redundancy in certain situations.
6. Lack of Flexibility: Some database systems may be less flexible when it comes to
handling unstructured or rapidly changing data.
7. Initial Setup Time: Setting up a database system can take time and effort, which may
delay the development of applications.
8. Vendor Lock-In: Organizations may become dependent on a specific database
vendor, making it challenging to switch to a different system.
9. Resource Consumption: Database systems can consume a significant amount of
system resources, which may impact overall system performance.
10. Data Loss Risk: Despite backup mechanisms, there is always a risk of data loss in
catastrophic events like hardware failures or natural disasters.
DATABASE USERS

1. DATABASE ADMINISTRATOR (DBA)
 DBA is a person or a group of persons responsible for the overall control of databases.
 DBA creates a new account id and password for each user.
 DBA is also responsible for providing security to the database and allows access only to
authorised users.
 DBA also monitors recovery and backup and provides technical support.
 DBA repairs damage caused by hardware and software failures.
2. PARAMETRIC END USERS
 Parametric end users are persons who do not have any DBMS knowledge but who
frequently use database applications in their daily life to get the desired result.
 Example - railway ticket booking users and bank clerks are parametric users
because they do not have any knowledge of DBMS but still use the database to
perform their given tasks.

3. SYSTEM ANALYST
 System Analyst is a user who analyses the requirements of parametric end users.
 They check whether all the requirements of end users are satisfied.

4. SOPHISTICATED USERS
 Sophisticated users can be engineers, scientists or business analysts who are familiar with
the database.
 They can develop their own database applications according to their requirements.

5. DATABASE DESIGNER
 Database designers are the users who design the structure of the database, which
includes tables, indexes and stored procedures.
6. APPLICATION PROGRAMMERS
Application programmers, also referred to as software engineers, are the back-end
programmers who write the code for the application programs.

7. CASUAL USERS
Casual or temporary users are users who access the database occasionally; whenever
they access the database they may require different information.
Example - middle or higher level managers.

DATABASE MANAGEMENT SYSTEM ( DBMS )

DBMS is software which is used to manage the database.

Example - MySQL
DBMS provides an interface to perform various operations like database creation,
storing data, updating data and creating tables in the database.
It provides protection and security to the database.
ADVANTAGES OF DBMS
1. Data Centralization: DBMS centralizes data storage, allowing all data to be stored
in one location. This simplifies data management by eliminating the need for
multiple, scattered data files.
2. Data Integrity: DBMS enforces data integrity constraints, ensuring that data is
accurate and consistent. This helps prevent errors and inconsistencies in the data.
3. Data Security: DBMS provides robust security features, including user
authentication and authorization, to control access to data. It helps protect sensitive
information from unauthorized access.
4. Concurrent Access: Multiple users or applications can access and manipulate data
simultaneously without conflicts or data corruption, thanks to the concurrency
control mechanisms of DBMS.
5. Data Sharing: DBMS allows multiple users and applications to share data
easily, facilitating collaboration and information exchange within an
organization.
6. Data Independence: Changes to the database schema can be made without affecting
the application programs that use the data. This data independence simplifies database
maintenance and upgrades.
7. Query Language: DBMS provides a query language (e.g., SQL) that allows users
to retrieve, manipulate, and analyse data efficiently. This simplifies data retrieval
and reporting.
8. Data Relationships: DBMS supports the establishment of relationships between data
in different tables, enabling complex queries and data modelling.
9. Scalability: DBMS systems can be scaled up to accommodate growing data and user
loads, making them suitable for both small and large-scale applications.
10. Backup and Recovery: DBMS systems offer backup and recovery mechanisms,
allowing organizations to recover data in case of accidental deletion, system failures,
or disasters.

DISADVANTAGES OF DBMS
1. Complexity: DBMS systems can be complex to set up and manage.
Database administrators require specialized training and expertise to ensure
optimal performance and security.
2. Cost: Implementing and maintaining a DBMS can be expensive. Costs include
software licenses, hardware, personnel training, and ongoing maintenance expenses.
3. Performance Overhead: DBMS systems introduce performance overhead due to
query processing, indexing, and data management. In some cases, complex queries
may take longer to execute compared to simpler file-based systems.
4. Data Size: Large databases may require significant storage resources and can lead to
increased hardware costs. Additionally, managing and backing up large volumes of
data can be time-consuming.
5. Complex Backup and Recovery: While DBMS systems offer backup and
recovery mechanisms, implementing and managing them effectively can be
complex. This complexity may lead to data loss if not properly configured.
6. Vendor Lock-In: Organizations may become locked into a specific DBMS
vendor's technology, making it challenging to switch to a different system in the
future.
7. Resource Consumption: Database systems can consume significant
system resources, including CPU, memory, and storage. This may affect the
overall performance of the host system.
8. Data Redundancy: While DBMS systems aim to reduce data redundancy, it is still
possible to have some redundancy in certain situations, leading to increased storage
requirements.
9. Data Security Risks: While DBMS systems offer security features, they can still be
vulnerable to security breaches if not properly configured and maintained. Security
risks include unauthorized access and data breaches.
10. Data Isolation: In some cases, data in a DBMS may be isolated and not easily
shared or integrated with other systems, leading to data silos.

FILE BASED APPROACH / TRADITIONAL FILE SYSTEM

 File-based systems were an early attempt to computerize the manual system.
 It is also called the traditional approach, in which a decentralized approach is
taken where each department stores and controls its own data with the help of data
processing staff.

LIMITATIONS OF FILE BASED APPROACH


1. Data Redundancy: As mentioned earlier, data redundancy is a major issue in
file-based systems, which can lead to inconsistencies and inefficiencies.
2. Data Inconsistency: Inconsistent data can arise when the same data is stored in
multiple files, making it difficult to update all occurrences uniformly.
3. Program-Data Dependence: Changes to data structures or file formats often require
modifications to multiple programs, making the system inflexible and hard to
maintain.
4. Data Isolation: Data isolation occurs when data in one file cannot be easily shared or
accessed by other parts of the system or different programs.
5. Lack of Data Integrity: Ensuring data integrity, such as enforcing referential
integrity and data validation rules, is challenging in file-based systems.
DBMS LANGUAGES

Database Management Systems (DBMS) use different categories of languages to interact


with and manipulate data. These languages are classified into four main categories: DDL,
DML, DCL, and TCL. Here's an overview of each:

Data Definition Language (DDL):


Purpose: DDL is used to define and manage the structure or schema of the database.
Commands:
CREATE: Used to create database objects such as tables, views, indexes,
and schemas.
ALTER: Allows modification of existing database objects, like adding or
dropping columns from a table.
DROP: Used to delete database objects like tables, views, and indexes.
TRUNCATE: Removes all data from a table but retains the table structure.
COMMENT: Adds comments or descriptions to database objects.

Example: CREATE TABLE, ALTER TABLE, DROP INDEX, etc.
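
A minimal sketch of these DDL statements in standard SQL (the Student table and its
columns are illustrative, not part of a fixed schema in these notes):

CREATE TABLE Student (
    roll_no INT PRIMARY KEY,
    name    VARCHAR(50),
    marks   INT
);                                            -- define a new table

ALTER TABLE Student ADD branch VARCHAR(10);   -- add a column to the existing table
TRUNCATE TABLE Student;                       -- remove all rows, keep the structure
DROP TABLE Student;                           -- delete the table definition itself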

Data Manipulation Language (DML):


Purpose: DML is used to manipulate or interact with the data stored within the database.
Commands:
SELECT: Retrieves data from one or more tables.
INSERT: Adds new rows or records to a table.
UPDATE: Modifies existing records in a table.
DELETE: Removes records from a table.

Example: SELECT * FROM, INSERT INTO, UPDATE, DELETE FROM, etc.
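
A minimal sketch of the DML statements on an illustrative Student(roll_no, name, marks) table:

INSERT INTO Student (roll_no, name, marks) VALUES (101, 'X', 95);   -- add a row
SELECT name, marks FROM Student WHERE marks > 80;                   -- retrieve rows
UPDATE Student SET marks = 90 WHERE roll_no = 101;                  -- modify a row
DELETE FROM Student WHERE roll_no = 101;                            -- remove a row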

Data Control Language (DCL):


Purpose: DCL is used to control access and permissions to the data and database objects.
Commands:
GRANT: Gives specific privileges to users or roles, allowing them to access
certain data or perform actions.
REVOKE: Removes privileges from users or roles, restricting their access or actions.

Example: GRANT SELECT ON, REVOKE INSERT, UPDATE ON, etc.
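
A brief sketch of DCL statements; the user name clerk_user is assumed purely for illustration:

GRANT SELECT, INSERT ON Student TO clerk_user;   -- allow the user to read and add rows
REVOKE INSERT ON Student FROM clerk_user;        -- take the insert privilege back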

Transaction Control Language (TCL):


Purpose: TCL is used to manage transactions within the database.
Commands:
COMMIT: Saves all changes made during the current transaction, making
them permanent.
ROLLBACK: Undoes all changes made during the current transaction, restoring
the database to its previous state.
SAVEPOINT: Sets a point within a transaction to which you can later roll back.

Example: COMMIT, ROLLBACK, SAVEPOINT, etc.
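
A minimal TCL sketch for a hypothetical funds-transfer transaction on an assumed Account table:

START TRANSACTION;
UPDATE Account SET balance = balance - 500 WHERE acc_no = 1;   -- debit one account
SAVEPOINT after_debit;                                          -- point we can roll back to
UPDATE Account SET balance = balance + 500 WHERE acc_no = 2;   -- credit the other account
-- ROLLBACK TO SAVEPOINT after_debit;   -- would undo only the credit step
COMMIT;                                 -- make both changes permanent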


These language categories are essential for users and applications to interact with the database
effectively. DDL defines the database structure, DML is used to manipulate data, DCL
manages access and permissions, and TCL ensures data consistency and integrity within
transactions.
Roles in the Database Environment: Data and Database
Administrator, Database Designers, Applications Developers and
Users
In the database environment, various roles and responsibilities exist to ensure the efficient
and effective management of data and databases. Here are some key roles and their respective
responsibilities:

1. Database Administrator (DBA):


 Responsibilities :
 Database Management: Installing, configuring, and maintaining the
database management system (DBMS).
 Security: Implementing and managing data security measures,
including user access control, authentication, and authorization.
 Backup and Recovery: Creating and maintaining backup and
recovery plans to protect data in case of failures.
 Performance Tuning: Monitoring and optimizing database
performance to ensure efficient data access.
 Data Maintenance: Managing data integrity, consistency, and
accuracy.
 Troubleshooting: Resolving database-related issues and
providing technical support.
 Capacity Planning: Planning for database growth and ensuring
adequate resources.
2. Database Designers:
 Responsibilities :
 Database Modeling: Designing the logical and physical
structure of the database, including tables, relationships, and
constraints.
 Normalization: Ensuring data is organized efficiently and avoids
redundancy through normalization techniques.
 Schema Design: Creating database schemas that define the
structure of tables and their attributes.
 Data Dictionary: Defining metadata, such as data types and
constraints, for database objects.
 Performance Optimization: Ensuring the database design
supports efficient data retrieval and manipulation.
3. Application Developers:
 Responsibilities :
 Application Development: Creating software applications that
interact with the database using SQL queries or APIs.
 Data Integration: Integrating database functionality into
applications to support data-driven processes.
 Query Optimization: Writing efficient SQL queries and
optimizing data retrieval and manipulation.
 Data Validation: Implementing data validation rules and error
handling in applications.
 Security: Ensuring secure access to the database within the
applications.
 Testing: Performing thorough testing of applications to ensure data
accuracy and reliability.
4. Users:
Responsibilities:
 Data Entry: Inputting data into the database through user
interfaces or applications.
 Data Retrieval: Querying the database to retrieve information
needed for decision-making or reporting.
 Data Analysis: Analyzing data to extract insights and support
business processes.
 Reporting: Generating reports and summaries from the
database data.
 Data Maintenance: Keeping data accurate and up-to-date
through updates and corrections.
 Compliance: Adhering to data access and security policies set by
administrators.

These roles work collaboratively to ensure that data is effectively managed, secured, and
made available to support the needs of an organization. Effective communication and
coordination among these roles are essential for the successful operation of the database
environment.
UNIT-II

 Database System Architecture:


Database system architecture refers to the overall structure and organization of a database
management system (DBMS), including its components, processes, and relationships. A well-
designed architecture is crucial for efficient storage, retrieval, and management of data in a
database.

Diagram of DBMS Architecture (three-schema architecture):

  View Level     - External Schemas (one per user group)
  Logical Level  - Conceptual Schema
  Physical Level - Physical Schema


External Level:-
An external schema defines how data appears to specific users or applications, specifying the view
of the database relevant to their needs and hiding unnecessary details.

 An external schema provides a customized and simplified view of the database for a
particular group of users or applications.
 It acts as a layer of abstraction that hides the complexity of the underlying database schema
 The external schema is designed to meet the requirements of a specific group of users or
applications.

Conceptual Level:-
A conceptual schema is a high-level, abstract representation of the entire database, describing
entities, their relationships, and constraints, providing a clear, conceptual understanding of the
data organization.

 All entities, their attributes and their relationships.
 Security and integrity rules.
 Semantic information about the data.
 Validation checks to retain data consistency and integrity.

Physical Level:-
The physical schema, or physical level, represents how data is stored, indexed, and organized at
the lowest level, focusing on disk storage, file structures, access paths, and hardware
considerations within a database system.

 The physical schema describes how data is physically stored on the underlying storage
devices such as hard drives, SSDs, or other storage media
 It defines the actual access paths and mechanisms used to retrieve and manipulate data
efficiently.
 The physical schema is closely tied to the hardware and operating system of the underlying
computing environment.
Data Independence:-
Data independence is the ability to modify the database schema at one level (e.g., logical) without
affecting the schema, applications, or processes at another level (e.g., physical), enhancing system
adaptability, flexibility, and maintenance.

There are two types of Data Independence:-

1. Physical Data Independence


2. Logical Data Independence


1. Physical Data Independence:

Physical data independence refers to the capacity to alter the physical storage and access
mechanisms without affecting the conceptual or external schemas. This is essential for
efficiency, as changes at the physical level can be made transparently to applications and
higher-level schemas.
Key points include:

Storage Modifications: Changes in storage structures, indexing, or storage


devices can occur without impacting the way data is viewed or accessed at higher
levels.

Performance Optimization: Adjustments for performance enhancements, such


as optimizing disk I/O or utilizing new storage technologies can be made without
altering the logical or external views.

Database Restructuring: The database can be reorganized for improved


performance, scalability, or resource utilization without necessitating modifications
in the application programs or the conceptual schema.
2. Logical Data Independence:- Logical data independence refers to the ability to
make changes to the logical schema (e.g., table structure, relationships) of a database
without affecting the external schema or applications, preserving data integrity and
application functionality. Key points include:

Schema Modification Isolated: Changes in the logical schema, like adding or


modifying tables or relationships, do not impact the external schema or application
programs.

Applications Remain Unaffected: External programs querying the database


can continue to function without modifications despite changes in the logical
schema, ensuring application stability.

Enhances Adaptability and Maintenance: Allows for database evolution and


maintenance, facilitating system upgrades or improvements without disrupting the
external view of the data.

 Difference between Physical Data Independence and Logical Data


Independence:-

Definition:
  Physical Data Independence - Ability to modify physical storage structures without
  affecting the conceptual or external schema.
  Logical Data Independence - Ability to modify the logical schema (e.g., tables,
  relationships) without affecting external schemas or applications.

Scope:
  Physical Data Independence - Pertains to the physical implementation of data within
  the database.
  Logical Data Independence - Relates to the conceptual organization of data within
  the database.

Impact on Schema Changes:
  Physical Data Independence - Changes to physical storage (e.g., indexing, storage
  mechanisms) do not affect the logical or external schemas.
  Logical Data Independence - Changes to the logical schema (e.g., table structure,
  relationships) do not affect external schemas, applications, or user views.

Importance:
  Physical Data Independence - Important for optimizing database performance,
  storage, and retrieval.
  Logical Data Independence - Important for ensuring flexibility in data representation
  and ease of modification, enhancing adaptability and maintenance.
Classification of DBMS:-
Database Management Systems (DBMS) can be classified based on various criteria, including
data model, users, purpose, and sites. Here's a classification based on these criteria:

  - Based on users: Single user, Multi user
  - Based on purpose: General purpose, Special purpose
  - Based on number of sites: Centralized, Distributed (Homogeneous, Heterogeneous)
  - Based on data models: Hierarchical, Network, Relational

 Based on users:-
1. Single User:- In a single-user DBMS the database is stored on a single computer and
accessed by only one user at a time.
2. Multi User:- The database is stored on a single system and accessed by multiple users
simultaneously.

 Based on Purpose:-
1. General Purpose:- A database which provides data for general purpose
applications is called a General Purpose DBMS.
Example: Oracle
2. Special Purpose:- Designed and built for a special purpose application.
Example: Airline reservation, banks

 Based on number of sites:-
1. Centralized: Data is stored at a single computer site. A centralized database
stores data in a single location; it is maintained and modified from that location only
and is usually accessed using a network connection such as a LAN or MAN.

(Diagram: several PCs connected to a single central database.)

2. Distributed Database Management System:- A distributed
database is basically a database that is not limited to one system. It is
spread over different sites.

(Diagram: multiple sites connected through a communication network.)

 In Distributed there are also two types:-

1. Heterogeneous Database :-
 In this, different sites can use different schemas and software,
which can lead to problems in query processing.
 Different operating systems and different applications are used.
They may even use different data models for the database.
2. Homogenous Database:-
 In a homogenous database, all the different sites store the database
identically.
 The operating system, DBMS and data structures used are the same
at all sites.

 Based on Data Models:-

A. Hierarchical DBMS: In this DBMS data records are stored in a tree-
like structure, suited to data with one-to-many relationships. The
hierarchy starts with a root node, connecting every child node to its
parent node.

(Diagram: a College root node with Student, Admin and Teacher child nodes; attributes
such as Roll No, Name, Salary and EMP ID hang from the child nodes.)

Network Model: - A network model is a system where the data
elements maintain one-to-one or many-to-many
relationships.

(Diagram: a College node linked to Library and CSE Dept., both of which are linked to Student, so a record can have more than one parent.)
B. Relational Data Model :
In a relational DBMS data is stored in the form of rows and columns which
together form a table.
A relational DBMS uses SQL for storing and manipulating as well as
maintaining the data.

Student         Father's Name    Mother's Name    DOB           Address    Branch
Aman Kumar      Rakesh Kumar     Reena            21-10-2005    Gurgaon    BCA
Vishal Phore    Manoj            Kavita           01-06-2003    Delhi      Btech
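
The same Student relation could be created and populated in SQL; a brief sketch with
assumed column names and sizes:

CREATE TABLE Student (
    student_name VARCHAR(50),
    father_name  VARCHAR(50),
    mother_name  VARCHAR(50),
    dob          DATE,
    address      VARCHAR(50),
    branch       VARCHAR(10)
);

INSERT INTO Student VALUES
    ('Aman Kumar',   'Rakesh Kumar', 'Reena',  '2005-10-21', 'Gurgaon', 'BCA'),
    ('Vishal Phore', 'Manoj',        'Kavita', '2003-06-01', 'Delhi',   'Btech');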

Centralized Architecture of DBMS: - Centralized DBMS architecture


involves a single server handling data storage, processing, and
management. All operations and interactions with the database occur on
this central server, simplifying maintenance and ensuring data
consistency.
Here are four key points to explain this architecture:

i. Single Server Control:- In a centralized architecture, all


aspects of the database management, including data storage,
processing, and control, are handled by a single central server or
a mainframe computer.

ii. Data Centralization: - All data is stored in a centralized


repository, typically on the central server. Users and applications
access and manipulate the data through this central point, which
helps maintain data consistency and reduces the complexity of
data access.

iii. Direct Communication with the Server: - Users and


applications directly communicate with the central server to
access or modify the data
iv. Limited Scalability and Potential Bottlenecks: - While
centralized architecture simplifies management, it can become a
bottleneck as the system scales or experiences high usage.

(Diagram: several terminals connected directly to the centralized DBMS server.)

Client server architecture of DBMS:-


Client-Server architecture in DBMS involves clients (requesters)
and servers (providers).
Clients initiate requests for data or operations, which servers
process and fulfil.
This design allows for centralized control, efficient management of
requests, scalability, and improved performance, making it a widely
used model in database systems.
They are both connected with the help of a network.
The client and server are usually present at different sites.
 There are two approaches to implement Client Server:-

A. Two Tier Architecture:-


 In two-tier architecture, the client handles the presentation and
user interface, while the server manages the application logic
and database.
 Clients directly interact with the server for data processing and
storage, simplifying the system.
 In Two Tier architecture the End User and Application Program are
placed on the Client; the Database System is placed on the server side.

Two-Tier Architecture Diagram:

  End User -> Application Program (Client) -> Database System (Server)

B. Three Tier Architecture:-


 In Three Tier Architecture an intermediate layer known as an
Application Server is placed between the Client and the Database
Server.
 The Client communicates with the application layer, which then sends the
request to the Database Server.
 The Application Server stores rules and procedures to process data.
 It checks the Client request and forwards it to the Database Server.
 The Database Server sends the result back to the Application
Server and then to the Client.
Three-Tier Architecture Diagram:

  GUI Interface (Client) -> Application Program (Application Server) -> Database (Database Server)

Data Models:-

Data models in computer science represent how data is stored,


accessed, and managed within information systems.
They define the structure, relationships, and constraints of data.
Common data models include the relational model, organizing
data in tables with rows and columns.
The hierarchical model organizes data in a tree-like structure,
while the network model uses a graph structure.
Object-oriented models represent data as objects with attributes
and methods.
Choosing the appropriate data model is critical for efficient data
storage, retrieval, and analysis based on the system's
requirements and objectives.
Classification of data models:
  - Object Based Models: E-R Model, Object Oriented Model, Semantic Data Model, Functional Data Model
  - Physical Models: Unifying Model, Frame Memory Model
  - Record Based Models: Hierarchical Model, Network Model, Relational Data Model

 Record Based Data Model:-


 Record-based data models organize data into fixed-format records. Each
record contains data about an entity, represented as a collection of fields.
 These models are efficient for transaction processing and retrieval of
structured data, commonly used in relational databases and file-based
systems.

1. Hierarchical Model:-
 In Hierarchical Model data are represented by collection of records.
 In this, relationship among the data represented by links.
 In this model tree data structure is used.
 It was developed in 1960 by IBM to manage large amount of data.
 The basic logical structure of the hierarchical data model is an upside-down tree.
Advantages:-
A. Simplicity
B. Data Integrity
C. Data Security
D. Data Efficiency
E. Easy Availability

Disadvantages:-
A. Complexity
B. Lack of flexibility
C. Lack of data independence.

(Diagram: an Employee root node with Contract, Permanent and Intern child nodes;
Permanent further divides into Manager and Software Engineer.)

2. Network Model:-
 In Network Model data are represented by collection of records.
 In this relationship among the data are represented by links.
 Graph data structures are used in this model.
 It permits a record to have more than one parent.
Advantages:-
1. Data Integrity
2. Database Standards

Disadvantages:-
1. System Complexity

(Diagram: Project 1 and Project 2 linked to Departments A, B and C, showing records with more than one parent.)

3. Relational Data Model:-


 It uses tables to represent the data and the relationships among
these data.
 Each table has multiple rows and columns, and each column has a
unique name.

Advantages
1. Easy To Design
2. Easy to Manage

Disadvantages
1. Easy to design can result in bad design.
UNIT-III

Entity Relationship Model


o ER model stands for an Entity-Relationship model. It is a high-level data model. This
model is used to define the data elements and relationship for a specified system.
o It develops a conceptual design for the database. It also develops a very simple and easy
to design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-
relationship diagram.
o For example - suppose we design a school database. In this database, the student
will be an entity with attributes like address, name, id, age, etc. The address can be another
entity with attributes like city, street name, pin code, etc., and there will be a relationship
between them.

Component of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.

Consider an organization as an example - manager, product, employee, department etc. can be
taken as an entity.

Entity types: are the basic building blocks for describing the structure of data. An entity type is
a category of a particular entity in an entity set. In summary, an entity is an object
of an entity type, and the set of all entities of a type is called an entity set. Entities are of two
types:

1. Strong Entity – A strong entity is an entity type that has a key attribute. It
doesn't depend on other entities in the schema. A strong entity always has a
primary key, and it is represented by a single rectangle in the ER diagram.

Example – roll_number identifies each student of the organization uniquely and


hence, we can say that the student is a strong entity type.

2. Weak Entity – Weak entity type doesn’t have a key attribute and so we
cannot uniquely identify them by their attributes alone. Therefore, a foreign
key must be used in combination with its attributes to create a primary key.
They are called Weak entity types because they can’t be identified on their
own. A weak entity relies on a strong entity for its unique identity. A
weak entity is represented by a double-outlined rectangle in ER diagrams.
The relationship between a weak entity type and a strong entity type is shown with
a double-outlined diamond instead of a single-outlined diamond.

Entity Sets:
Entity set: is a group of entities of similar kinds. It can
contain entities with attributes that share similar values. It's collectively a group
of entities of a similar type. For example, a car entity set, an animal entity set, a bank
account entity set, and so on.

Attribute:
Attributes are the characteristics or properties which define the entity type. In ER
diagram, the attribute is represented by an oval.

For example, here Id, Name, Age, and Mobile No are the attributes that define
the entity type Student. The main types of attributes are:

1. Simple attribute: Attributes that cannot be further decomposed into sub-
attributes are called simple attributes. A simple attribute holds an atomic value
(it is also called an atomic attribute). Simple attributes are represented by an
oval shape in ER diagrams.

2. Composite attribute: An attribute that is composed of many


other attributes and can be decomposed into simple attributes is known as
a composite attribute in DBMS. The composite attribute is represented by an
ellipse.
3. Multivalued attribute: Multivalued attributes in DBMS are attributes that
can have more than one value. The double oval is used to represent a
multivalued attribute.

4. Derived attribute: Derived attributes in DBMS are the ones that can be
derived from other attributes of an entity type. The derived attributes are
represented by a dashed oval symbol in the ER diagram.
Relationship

The concept of relationship in DBMS is used to describe the relationship between


different entities. This is denoted by the diamond or a rhombus symbol. For
example, the teacher entity type is related to the student entity type and their
relation is represented by the diamond shape.

There are four types of relationships:

1. One-to-One Relationships: When only one instance of an entity is


associated with the relationship to one instance of another entity, then it is
known as one to one relationship. For example, let us assume that a male can
marry one female and a female can marry one male. Therefore the relation is
one-to-one.
2. One-to-Many Relationships: If only one instance of the entity on the left
side of the relationship is linked to multiple instances of the entity on the
right side, then this is considered a given one-to-many relationship. For
example, a Scientist can invent many inventions, but the invention is done
by only a specific scientist.

3. Many-to-One Relationships: If multiple instances of the entity on the left
side of the relationship are linked to only one instance of the entity on the
right side, then this is considered a many-to-one relationship. For
example, a Student enrolls for only one course, but a course can have many
students.

4. Many to Many Relationships: If multiple instances of the entity on the left are
linked by the relationship to multiple instances of the entity on the right, this is
considered a many-to-many relationship. For example, one
employee can be assigned many projects, and one project can be assigned to
many employees.

Features of ER
The features of ER Model are as follows −

 ER Diagram: ER diagrams are the diagrams that are sketched out to design
the database. They are created based on three basic
concepts: entities, attributes, and relationships between them. In ER diagram
we define the entities, their related attributes, and the relationships between
them. This helps in illustrating the logical structure of the databases.
 Database Design: The Entity-Relationship model helps the database
designers to build the database in a very simple and conceptual manner.
 Graphical Representation helps in Better Understanding: ER diagrams
are very easy and simple to understand and so the developers can easily use
them to communicate with stakeholders.
 Easy to build: The ER model is very easy to build.
 The extended E-R features: Some of the additional features of ER model
are specialization, upper and lower-
level entity sets, attribute inheritance, aggregation, and generalization.
 Integration of ER model: This model can be integrated into a common
dominant relational model and is widely used by database designers for
communicating their ideas.
 Simplicity and various applications of ER model: It provides a preview of
how all your tables should connect, and what fields are going to be on each
table, which can be used as a blueprint for implementing data in specific
software applications.
 What is Data Abstraction?
 Data Abstraction = Data abstraction refers to the process of
simplifying complex data structures or systems by focusing on the
essential aspects and hiding unnecessary details. It involves
representing data and its operations at a higher level of abstraction,
making it easier to understand and work with.
 The main purpose of this is "to hide unnecessary details and
provide an abstract view of the data for the end user".

LEVELS OF ABSTRACTION

1.PHYSICAL LEVEL:

 The physical or internal layer is the lowest level of data abstraction
in the database management system. It is the layer that defines
how data is actually stored in the database. It defines methods to
access the data in the database. It defines complex data structures
in detail, so it is very complex to understand, which is why it is
kept hidden from the end user.

2. LOGICAL LEVEL:

 The logical or conceptual level is the intermediate or next level of
data abstraction. This level represents the overall logical structure
and organization of the entire database. It explains what data is
going to be stored in the database and what is the relationship
between entities, their attributes, and the constraints that govern
them.

3. EXTERNAL LEVEL:

 View or External Level is the highest level of data abstraction that
focuses on the specific needs and requirements of individual users
or groups of users. It represents the portion of the database that is
relevant to a particular user's perspective. At this level, users
define their own customized views of the data by specifying the
desired queries. The external level allows for data independence and
provides a personalized view of the database for different users.

BASIC CONCEPT OF HIERARCHICAL MODEL:

A hierarchical model represents the data in a tree-like structure in
which there is a single parent for each record. To maintain order there
is a sort field which keeps sibling nodes in an ordered manner. These
types of models were designed basically for the early mainframe database
management systems, like the Information
Management System (IMS) by IBM.

Data in this type of database is structured hierarchically and is typically
developed as an inverted tree. The "root"
in the structure is a single table in the database and other tables act as
the branches flowing from the root. The diagram below shows a typical
hierarchical database structure.

NETWORK MODEL:

The network model was created to represent complex data relationships more
effectively when compared to hierarchical models, to improve database
performance and standards.

It has entities which are organized in a graphical representation and some entities
are accessed through several paths. A user perceives the network model as a
collection of records in 1:M relationships.

Given below is the pictorial representation of the network model in DBMS −

RELATIONAL DATA MODEL:


Relational data model is the primary data model, which is used widely around the
world for data storage and processing. This model is simple and it has all the properties
and capabilities required to process data with storage efficiency.
Concepts
Tables − In the relational data model, relations are saved in the format of tables. This
format stores the relation among entities. A table has rows and columns, where rows
represent records and columns represent the attributes.

Tuple − A single row of a table, which contains a single record for that relation, is
called a tuple.

Relation instance − A finite set of tuples in the relational database system represents a
relation instance. Relation instances do not have duplicate tuples.

Relation schema − A relation schema describes the relation name (table name),
attributes, and their names.

Relation key − Each row has one or more attributes, known as the relation key,
which can identify the row in the relation (table) uniquely.

Attribute domain − Every attribute has some pre-defined value scope, known
as the attribute domain.

Constraints
Every relation has some conditions that must hold for it to be a valid relation. These
conditions are called Relational Integrity Constraints. There are three main
integrity constraints −

Key constraints
Domain constraints
Referential integrity constraints
Key Constraints

There must be at least one minimal subset of attributes in the relation which can
identify a tuple uniquely. This minimal subset of attributes is called the key for that
relation. If there is more than one such minimal subset, these are called candidate
keys.
Key constraints force that −

in a relation with a key attribute, no two tuples can have identical values
for key attributes.
a key attribute cannot have NULL values.

Key constraints are also referred to as Entity Constraints.

Domain Constraints

Attributes have specific values in real-world scenarios. For example, age can only be a
positive integer. The same constraints are applied to the attributes of a
relation. Every attribute is bound to have a specific range of values. For example, age
cannot be less than zero and telephone numbers cannot contain a digit outside 0-9.

Referential integrity Constraints

Referential integrity constraints work on the concept of Foreign Keys. A foreign key
is a key attribute of a relation that can be referred to in another relation.

The referential integrity constraint states that if a relation refers to a key attribute of a
different or the same relation, then that key element must exist.
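
All three constraint types can be declared together when a table is defined; a minimal
sketch with illustrative table and column names:

CREATE TABLE Course (
    course_id INT PRIMARY KEY                 -- key constraint
);

CREATE TABLE Student (
    roll_no   INT PRIMARY KEY,                -- key constraint: unique, non-NULL
    age       INT CHECK (age > 0),            -- domain constraint on legal values
    course_id INT,
    FOREIGN KEY (course_id) REFERENCES Course(course_id)   -- referential integrity
);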

HISTORY OF RELATIONAL DATA MODEL:


In June of 1970, a computer scientist from IBM named Edgar F. Codd published
an academic paper titled "A Relational Model of Data for Large Shared Data Banks".
That paper introduced a new way to model data. It elaborated a way of building a
bunch of cross-linked tables that would allow you to store any piece of data just
once. A database with this structure could answer any question, so long as the
answer was stored somewhere in it. Disk space would be used efficiently, at a time
when storage was expensive. This paper launched databases into the future.
Oracle brought the first commercial relational database to market in 1979, followed
by DB2, Sybase ASE, and Informix.

In the 1980s and ’90s, relational databases grew increasingly dominant, delivering
rich indexes to make any query efficient. Table joins, the term for read operations
that pull together separate records into one, and Transactions, which means a
combination of reads and especially writes spread across the database, were
essential. SQL, the Structured Query Language, became the language of data, and
software developers learned to use it to ask for what they wanted, and let the
database decide how to deliver it. Strict guarantees were engineered into the
database to prevent surprises.

Important Terminologies
Here are some Relational Model concepts in DBMS:

Attribute: It refers to every column present in a table. The attributes refer to


the properties that help us define a relation. E.g., Employee_ID,
Student_Rollno, SECTION, NAME, etc.
Tuple – It is a single row of a table that consists of a single record. The
relation above consists of four tuples, one of which is like:
C1 RIYA DELHI 15 20

Tables – In the case of the relational model, all relations are saved in the
table format, and it is stored along with the entities. A table consists of two
properties: columns and rows. While rows represent records, the columns
represent attributes.
Degree: It refers to the total number of attributes that are there in the
relation. The EMPLOYEE relation defined here has degree 5.
Relation Schema: It represents the relation’s name along with its attributes.
E.g., EMPLOYEE (ID_NO, NAME, ADDRESS, ROLL_NO, AGE) is the
relation schema for EMPLOYEE. If a schema has more than 1 relation, then
it is known as Relational Schema.
Column: It represents the set of values for a certain attribute. The column
ID_NO is extracted from the relation EMPLOYEE.
Cardinality: It refers to the total number of rows present in the given table.
The EMPLOYEE relation defined here has cardinality 4.
Relation instance – It refers to a finite set of tuples present in the RDBMS
system. A relation instance never has duplicate tuples.
Attribute domain – Every attribute has some predefined value and scope,
which is known as the attribute domain.
Relation key – Each row has one or more attributes which can identify the row
uniquely; these are known as the relation key.
NULL Values: The value that is NOT known or the value that is unavailable
is known as a NULL value. This null value is represented by the blank
spaces. E.g., the MOBILE of the EMPLOYEE having ID_NO 4 is NULL.
Relational database structure


The database and the database structure are defined in the installation process.
The structure of the database depends on whether the database is Oracle Database,
IBM® DB2®, or Microsoft SQL Server.

A database can be perceived as a set of tables and manipulated in accordance
with the relational model of data. Each database includes:

a set of system catalog tables that describe the logical and physical
structure of the data
a configuration file containing the parameter values allocated for the
database
a recovery log with ongoing transactions and archivable transactions

DATABASE RELATION:
A relational database collects different types of data sets that use tables, records,
and columns. It is used to create well-defined relationships between database tables so
that data can be stored and retrieved easily. Examples of relational databases
include Microsoft SQL Server, Oracle Database, MySQL, etc.

There are some important parameters of the relational database:

o It is based on a relational model (Data in tables).


o Each row in the table is identified by a unique id (key).
o Columns of the table hold attributes of data.

Employee table (Relation / Table Name)

EmpID     EmpName          EmpAge   CountryName
Emp 101   Andrew Mathew    24       USA
Emp 102   Marcus Dugles    27       England
Emp 103   Engidi Nathem    28       France
Emp 104   Jason Quilt      21       Japan
Emp 108   Robert           29       Italy

Following are the different types of relational database tables.

1. One to One relationship


2. One to many or many to one relationship
3. Many to many relationships

One to One Relationship (1:1): It is used to create a relationship between two tables in which a
single row of the first table can only be related to one and only one record of a second table.
Similarly, a row of the second table can also be related to only one row of the first table.

Following is the example to show a relational database, as shown below.


One to Many Relationship: It is used to create a relationship between two tables. Any single
row of the first table can be related to one or more rows of the second table, but a row of the
second table can only relate to one row of the first table. Viewed from the other side, it is a
many to one relationship.

Representation of One to Many relational databases:

Representation of many to one relational database


Many to Many Relationship: It is a many to many relationship that creates a
relationship between two tables. Each record of the first table can relate to any
records (or no records) in the second table. Similarly, each record of the second
table can also relate to more than one record of the first table. It is also represented
as an N:N relationship.

For example, there are many people involved in each project, and every person
can be involved in more than one project.
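
In a relational database, a many-to-many relationship such as the person/project example
above is usually implemented with a third (junction) table; a brief sketch with assumed names:

CREATE TABLE Person (
    person_id INT PRIMARY KEY,
    name      VARCHAR(50)
);

CREATE TABLE Project (
    proj_id INT PRIMARY KEY,
    title   VARCHAR(50)
);

-- Junction table: each row links one person to one project
CREATE TABLE Works_On (
    person_id INT,
    proj_id   INT,
    PRIMARY KEY (person_id, proj_id),
    FOREIGN KEY (person_id) REFERENCES Person(person_id),
    FOREIGN KEY (proj_id)   REFERENCES Project(proj_id)
);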

Properties of relations:
1. Each table in a database has a unique identity (name).
2. Any entry at the intersection of each row and column has a single value.
There can be only one value that is related with each attribute on a specific
row of a table; no multivalued attributes are allowed in a relation.
3. Each row is unique; no two rows in the same relation can be
identical.
4. Each attribute (or column) within a table has a unique name.
5. The sequence of columns (left to right) is insignificant. The order of the
columns in a relation can be changed without changing the meaning or use
of the relation.
6. The sequence of rows (top to bottom) is insignificant. As with
columns, the order of the rows of a relation may be changed or stored
in any sequence.
 KEYS:
Keys
o Keys play an important role in the relational database.
o It is used to uniquely identify any record or row of data from the table. It is
also used to establish and identify relationships between tables.

For example, ID is used as a key in the Student table because it is unique for each
student. In the PERSON table, passport_number, license_number, SSN are keys
since they are unique for each person.

Types of keys:

1. Primary key
o It is the first key used to identify one and only one instance of an entity
uniquely. An entity can contain multiple keys, as we saw in the PERSON
table. The key which is most suitable from those lists becomes a primary
key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for
each employee. In the EMPLOYEE table, we can even select
License_Number and Passport_Number as primary keys since they are also
unique.
o For each entity, the primary key selection is based on requirements and
developers.
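
A primary key is declared when the table is created; a minimal sketch of the EMPLOYEE
table described above (column types and sizes are assumptions):

CREATE TABLE EMPLOYEE (
    ID              INT PRIMARY KEY,      -- primary key: unique and never NULL
    Name            VARCHAR(50),
    License_Number  VARCHAR(20),
    Passport_Number VARCHAR(20)
);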

2. Candidate key
o A candidate key is an attribute or set of attributes that can uniquely identify
a tuple.
o Except for the primary key, the remaining attributes are considered a
candidate key. The candidate keys are as strong as the primary key.

For example: In the EMPLOYEE table, id is best suited for the primary key. The
rest of the attributes, like SSN, Passport_Number, License_Number, etc., are
considered a candidate key.
3. Super Key

Super key is an attribute set that can uniquely identify a tuple. A super key is a
superset of a candidate key.

For example: In the above EMPLOYEE table, for (EMPLOYEE_ID,
EMPLOYEE_NAME), the name of two employees can be the same, but their
EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.

The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
4. Foreign key
o Foreign keys are the column of the table used to point to the primary key of
another table.
o Every employee works in a specific department in a company, and employee
and department are two different entities. So we can't store the department's
information in the employee table. That's why we link these two tables
through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a
new attribute in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the
tables are related.
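
A brief sketch of the EMPLOYEE-DEPARTMENT link described above (column types and sizes are assumed):

CREATE TABLE DEPARTMENT (
    Department_Id INT PRIMARY KEY,
    Dept_Name     VARCHAR(50)
);

CREATE TABLE EMPLOYEE (
    ID            INT PRIMARY KEY,
    Name          VARCHAR(50),
    Department_Id INT,
    FOREIGN KEY (Department_Id) REFERENCES DEPARTMENT(Department_Id)  -- foreign key
);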

5. Alternate key

There may be one or more attributes or a combination of attributes that uniquely


identify each tuple in a relation. These attributes or combinations of the attributes
are called the candidate keys. One key is chosen as the primary key from these
candidate keys, and the remaining candidate key, if it exists, is termed the alternate
key. In other words, the total number of the alternate keys is the total number of
candidate keys minus the primary key. The alternate key may or may not exist. If
there is only one candidate key in a relation, it does not have an alternate key.
For example, employee relation has two attributes, Employee_Id and PAN_No,
that act as candidate keys. In this relation, Employee_Id is chosen as the primary
key, so the other candidate key, PAN_No, acts as the alternate key.

6. Composite key

Whenever a primary key consists of more than one attribute, it is known as a


composite key. This key is also known as Concatenated Key.

For example, in employee relations, we assume that an employee may be assigned


multiple roles, and an employee may work on multiple projects simultaneously. So
the primary key will be composed of all three attributes, namely Emp_ID,
Emp_role, and Proj_ID in combination. So these attributes
act as a composite key since the primary key comprises more than one attribute.
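
A composite primary key can be declared over the three attributes mentioned above; the
table name EMP_PROJECT here is a hypothetical choice for illustration:

CREATE TABLE EMP_PROJECT (
    Emp_ID   INT,
    Emp_role VARCHAR(30),
    Proj_ID  INT,
    PRIMARY KEY (Emp_ID, Emp_role, Proj_ID)   -- composite (concatenated) key
);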

7. Artificial key

The key created using arbitrarily assigned data are known as artificial keys. These
keys are created when a primary key is large and complex and has no relationship
with many other relations. The data values of the artificial keys are usually
numbered in a serial order.

For example, the primary key, which is composed of Emp_ID, Emp_role, and
Proj_ID, is large in employee relations. So it would be better to add a new virtual
attribute to identify each tuple in the relation uniquely.

DOMAIN:

A domain is a unique set of values that can be assigned to an attribute in a


database. For example, a domain of strings can accept only string values.

Introduction to Database Domain

The data type defined for a column in a database is called a database domain. This
data type can either be a built-in type (such as an integer or a string) or a custom
type that defines data constraints.
To understand this more effectively, let's think like this :
A database schema has a set of attributes, also called columns or fields, that define
the database. Each attribute has a domain that specifies the types of values that can
be used, as well as other information such as data type, length, and values.

Creating a Domain:

To create a domain, we use the CREATE DOMAIN command in SQL.

Let's look at the syntax :

CREATE DOMAIN C_Number INT(10) NOT NULL;

The above statement, for example, creates a C_Number domain of ten-digit integers for storing contact numbers, in which a NULL or unknown value is not permitted.
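Assuming the C_Number domain above has been created (CREATE DOMAIN is part of the SQL standard and is supported by systems such as PostgreSQL, but not by every DBMS, and the exact syntax varies), it could then be used as a column type:

CREATE TABLE SUPPLIER (
    Supplier_Id INT PRIMARY KEY,
    Contact_No C_Number   -- the column inherits the domain's data type and NOT NULL constraint
);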

 INTEGRITY CONSTRAINTS OVER RELATIONS:

To preserve the consistency and correctness of stored data, a relational DBMS typically imposes one or more data integrity constraints. These constraints restrict the data values that can be inserted into the database or created by a database update.

Data Integrity Constraints

There are different types of data integrity constraints that are commonly found in
relational databases, including the following −

 Required data − Some columns in a database must contain a valid data value in each row; they are not allowed to contain NULL values. In the sample database, every order has an associated customer who placed the order. The DBMS can be asked to prevent NULL values in this column.
 Validity checking − Every column in a database has a domain, a set of data values that are legal for that column. The DBMS can be asked to prevent other data values in these columns.
 Entity integrity − The primary key of a table must contain a unique value in each row, different from the values in all other rows. Duplicate values are illegal because they would not allow the database to differentiate one entity from another. The DBMS can be asked to enforce this unique-values constraint.
 Referential integrity − A foreign key in a relational database links each row in the child table containing the foreign key to the row of the parent table containing the matching primary key value. The DBMS can be asked to enforce this foreign key/primary key constraint.
 Other data relationships − The real-world situation modeled by a database often has additional constraints that govern the legal data values that may appear in the database. The DBMS can be asked to check modifications to the tables to make sure that their values satisfy these constraints.
 Business rules − Updates to a database may be constrained by business rules governing the real-world transactions that the updates represent.
 Consistency − Many real-world transactions cause multiple updates to a database. The DBMS can be asked to enforce this type of consistency rule or to support applications that implement such rules (a combined SQL sketch follows this list).
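A minimal SQL sketch showing how several of these constraints can be declared, assuming hypothetical CUSTOMERS and ORDERS tables (the names and columns are illustrative):

CREATE TABLE CUSTOMERS (
    Cust_Id INT PRIMARY KEY,           -- entity integrity: unique, non-NULL identifier
    Cust_Name VARCHAR(50) NOT NULL     -- required data: NULL values not allowed
);

CREATE TABLE ORDERS (
    Order_Id INT PRIMARY KEY,
    Cust_Id INT NOT NULL,
    Amount DECIMAL(10, 2) CHECK (Amount > 0),              -- validity checking / business rule
    FOREIGN KEY (Cust_Id) REFERENCES CUSTOMERS (Cust_Id)   -- referential integrity
);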
UNIT- IV

Relational Algebra
Relational algebra is a procedural query language that takes relation instances as input and returns relation instances as output. It performs queries with the help of unary and binary operators, each of which takes relations as input and produces a relation as output.
Types of Relational Operations
1. Select (σ)
2. Project (∏)
3. Union (𝖴)
4. Set Difference (-)
5. Set Intersection (∩)
6. Cartesian product (X)
7. Rename (ρ)

1. Select (σ)

Select operation is done by the Selection Operator, which is represented by "sigma" (σ). It is used to retrieve tuples (rows) from the table where the given condition is satisfied. It is a unary operator, meaning it requires only one operand.
Notation : σ p(R)
Where σ is used to represent SELECTION
R is used to represent the RELATION
p is the selection condition (a logical formula)

Syntax of Select Operator (σ): σ condition(Relation)

Select Operator (σ) Example:-
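A minimal sketch, assuming a hypothetical STUDENT(Roll_No, Name, Age) relation:
Query: σ Age > 17(STUDENT)
Output: all rows of STUDENT whose Age value is greater than 17.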

2. Project (∏)
Project operation is done by Projection Operator which is represented by "pi"(∏). It is used to
retrieve certain attributes(columns) from the table. It is also known as vertical partitioning as it
separates the table vertically. It is also a unary operator.
Notation : ∏ a(r)
Where ∏ is used to represent PROJECTION
r is used to represent RELATION
a is the attribute list
Syntax of Project Operator (∏): ∏ attribute_list(Relation)

Project Operator (∏) Example:-

We have a table CUSTOMER with three columns; suppose we want to fetch only two of them. We can do this with the help of the Project Operator (∏), as sketched below.
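A minimal sketch, assuming a hypothetical CUSTOMER(Customer_Id, Name, City) relation from which only the name and city are required:
Query: ∏ Name, City(CUSTOMER)
Output: a relation containing only the Name and City columns of CUSTOMER, with duplicate rows removed.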

3. Union (𝖴)
Union operation is done by Union Operator which is represented by "union"(𝖴). It is the same as
the union operator from set theory, i.e., it selects all tuples from both relations but with the
exception that for the union of two relations/tables both relations must have the same set of
Attributes. It is a binary operator as it requires two operands.
Notation: R 𝖴 S
Where R is the first relation
S is the second relation
If the relations don't have the same set of attributes, they are not union-compatible and the union of such relations is not valid.
Syntax of Union Operator (𝖴): Relation1 𝖴 Relation2

Union Operator (𝖴) Example:-

Suppose we have two relations, COURSE and STUDENT, each containing a column of student names; a sketch is given below.
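A minimal sketch, assuming hypothetical relations COURSE(Student_Name) and STUDENT(Student_Name), each holding a single column of names:
Query: COURSE 𝖴 STUDENT
Output: every distinct student name that appears in either COURSE or STUDENT (duplicates are eliminated).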

4. Set Difference (-)


Set Difference as its name indicates is the difference between two relations (R-S). It is denoted
by a "Hyphen"(-) and it returns all the tuples(rows) which are in relation R but not in relation S.
It is also a binary operator.
Notation : R - S
Where R is the first relation
S is the second relation
Syntax of Set Difference (-): Relation1 - Relation2

Set Difference (-) Example:-

Let's take the same COURSE and STUDENT relations seen above.
Query: select those student names that are present in the STUDENT table but not in the COURSE table (a sketch is given below).
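A minimal sketch, under the same assumptions as the union example:
Query: STUDENT - COURSE
Output: the names of students that appear in STUDENT but not in COURSE.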

5. Intersection (∩)
Intersection operation is done by the Intersection Operator, which is represented by "intersection" (∩). It is the same as the intersection operator from set theory, i.e., it selects all the tuples which are present in both relations. It is a binary operator as it requires two operands. Also, it eliminates duplicates.
Notation : R ∩ S
Where R is the first relation
S is the second relation
Syntax of Intersection Operator (∩): Relation1 ∩ Relation2

Intersection Operator (∩) Example:-

Let's take the same COURSE and STUDENT relations used above.
Query: select the student names that appear in both tables (a sketch is given below).
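A minimal sketch, under the same assumptions as the union example:
Query: STUDENT ∩ COURSE
Output: the student names that appear in both STUDENT and COURSE.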

6. Cartesian product (X)


Cartesian product is denoted by the "X" symbol. Let's say we have two relations R and S.
Cartesian product will combine every tuple(row) from R with all the tuples from S. It is also a
binary operator.
Notation: R X S
Where R is the first relation
S is the second relation
Syntax of Cartesian Product (X): Relation1 X Relation2

Cartesian Product (X) Example:-

Query: let's find the Cartesian product of tables R and S (a sketch with small assumed tables is given below).
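A minimal sketch, assuming hypothetical relations R(A) with rows 1, 2, 3 and S(B) with rows x, y, z:
Query: R X S
Output: (1, x), (1, y), (1, z), (2, x), (2, y), (2, z), (3, x), (3, y), (3, z).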

Note: The number of rows in the output will always be the product of the numbers of rows in each table. In our example, table R has 3 rows and table S has 3 rows, so the output has 3 × 3 = 9 rows.

7. Rename (ρ)
Rename operation is denoted by "rho" (ρ). As its name suggests, it is used to rename the output relation. The rename operator is a unary operator.
Notation: ρ(R, S)
Where R is the new relation name
S is the old relation name
Rename (ρ) Syntax: ρ(new_name, Relation)

Rename (ρ) Example:-

Let's say we have a CUSTOMER table; we fetch the customer names and rename the resulting relation to CUST_NAMES (a sketch is given below).
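A minimal sketch, assuming a hypothetical CUSTOMER(Customer_Name, Address) relation:
Query: ρ(CUST_NAMES, ∏ Customer_Name(CUSTOMER))
Output: the Customer_Name column of CUSTOMER, now available under the relation name CUST_NAMES.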

Relational Calculus
Relational calculus is a non-procedural query language and gives no description of how the query will work or how the data will be fetched. It focuses only on what to do, not on how to do it.
Types of Relation calculus

1. Tuple Relational Calculus (TRC)


2. Domain Relational Calculus (DRC)

1. Tuple Relational Calculus (TRC) :- Tuple Relational Calculus in DBMS uses a tuple
variable (t) that goes to each row of the table and checks if the predicate is true or false for the
given row. Depending on the given predicate condition, it returns the row or part of the row.
Syntax
{T | P (T)} or {T | Condition (T)}

Example
Table: Student
First_Name Last_Name Age

Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28

Query:
{ t.Last_Name | Student(t) AND t.age > 30 }

Output:
Last_Name

Singh
2. Domain Relational Calculus (DRC) :- Domain Relational Calculus uses domain
Variables to get the column values required from the database based on the predicate expression
or condition.

Syntax
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}

Example:-
First_Name Last_Name Age

Ajeet Singh 30
Chaitanya Singh 31
Rajeev Bhatia 27
Carl Pratap 28
Query
{<First_Name, Age> | <First_Name, Last_Name, Age> ∈ Student ∧ Age > 27}

Note:
The symbols used for logical operators are: ∧ for AND, ∨ for OR and ¬ for NOT.

Output
First_Name Age

Ajeet 30
Chaitanya 31
Carl 28

Relational Database Design


Relational database design (RDD) models information and data into a set of tables with rows and
columns. Each row of a relation/table represents a record, and each column represents an
attribute of data. The Structured Query Language (SQL) is used to manipulate relational
databases. The design of a relational database is composed of four stages, where the data are
modeled into a set of related tables. The stages are:
 Relations and attributes: The various tables and attributes related to each table are
identified. The tables represent entities, and the attributes represent the properties of the
respective entities.
 Primary keys: The attribute or set of attributes that help in uniquely identifying a record
is identified and assigned as the primary key
 Relationships: The relationships between the various tables are established with the help
of foreign keys. Foreign keys are attributes occurring in a table that are primary keys of
another table. The types of relationships that can exist between the relations (tables) are:
 One to one
 One to many
 Many to many
An entity-relationship diagram can be used to depict the entities, their attributes and the
relationship between the entities in a diagrammatic way.
 Normalization: This is the process of optimizing the database structure. Normalization
simplifies the database design to avoid redundancy and confusion. The different normal
forms are as follows:
 First normal form
 Second normal form
 Third normal form
 Boyce-Codd normal form
 Fourth normal form
 Fifth normal form

Functional Dependencies
In a relational database management system, a functional dependency specifies the relationship between two sets of attributes, where one set determines the value of the other. It is denoted as X → Y, where the attribute set X on the left side of the arrow is called the determinant, and Y is called the dependent.
Armstrong’s axioms/properties of functional dependencies:
1. Reflexivity: If Y is a subset of X, then X→Y holds by reflexivity rule
Example, {roll_no, name} → name is valid.
2. Augmentation: If X → Y is a valid dependency, then XZ → YZ is also valid by the
augmentation rule.
Example, {roll_no, name} → dept_building is valid, hence {roll_no, name, dept_name}
→ {dept_building, dept_name} is also valid.
3. Transitivity: If X → Y and Y → Z are both valid dependencies, then X→Z is also valid
by the Transitivity rule.
Example, roll_no → dept_name & dept_name → dept_building, then roll_no →
dept_building is also valid.
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency

1. Trivial Functional Dependency


In Trivial Functional Dependency, a dependent is always a subset of the determinant.
i.e. If X → Y and Y is the subset of X, then it is called trivial functional dependency
Example:

roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, {roll_no, name} → name is a trivial functional dependency, since the dependent name is a subset of the determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an example of a trivial functional dependency.
2. Non-trivial Functional Dependency
In Non-trivial functional dependency, the dependent is strictly not a subset of the
determinant. i.e. If X → Y and Y is not a subset of X, then it is called Non-trivial
functional dependency.
Example:
roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, roll_no → name is a non-trivial functional dependency, since the dependent name is not a subset of the determinant roll_no. Similarly, {roll_no, name} → age is also a non-trivial functional dependency, since age is not a subset of {roll_no, name}.
3. Multivalued Functional Dependency
In Multivalued functional dependency, entities of the dependent set are not
dependent on each other. i.e. If a → {b, c} and there exists no functional
dependency between b and c, then it is called a multivalued functional dependency.
For example,

roll_no name age

42 abc 17

43 pqr 18

44 xyz 18

Here, roll_no → {name, age} is a multivalued functional dependency, since the dependents name and age are not dependent on each other (i.e. neither name → age nor age → name exists).
4. Transitive Functional Dependency
In transitive functional dependency, dependent is indirectly dependent on determinant. i.e. If a →
b & b → c, then according to axiom of transitivity, a → c. This is a transitive functional
dependency.
For example,

enrol_no name dept building_no

42 abc CO 4

43 pqr EC 2

44 xyz IT 1

45 abc EC 2

Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect
functional dependency, hence called Transitive functional dependency.
Advantages of Functional Dependencies
1. Data Normalization:- Data normalization is the process of organizing data in a database
in order to minimize redundancy and increase data integrity.
2. Query Optimization:- With the help of functional dependencies, we can decide the connectivity between the tables and which attributes need to be projected to retrieve the required data from the tables.
3. Consistency of Data:- Functional dependencies ensure the consistency of the data by removing any redundancies or inconsistencies that may exist in the data.
4. Data Quality Improvement:- Functional dependencies help to keep the data in the database accurate, complete and up to date.

Anomalies
Anomalies are problems or inconsistencies that arise during operations performed on a table. They can occur for many reasons: for example, when data is stored multiple times unnecessarily (redundant data), or when all the data is stored in a single table. Normalization is used to overcome anomalies. The different types of anomalies are insertion, deletion and updation anomalies.
Type of Anomalies
1. Update
2. Insert
3. Delete

Input
The same input is used for all three anomalies.

Student

ID Name Age Branch Branch_Code Hod_name

1 A 17 Civil 101 Aman

2 F 22 Electrical 103 Rakesh

1. Insertion Anomaly:-
When certain data or attributes cannot be inserted into the database without the presence of other
data, it's called insertion anomaly.
For example, let's take a branch named Petroleum: the data regarding Petroleum cannot be stored in the table unless we insert a student who is in the Petroleum branch.

Code
INSERT INTO Student VALUES (3, 'G', 16, 'Petroleum', 104, 'Naman');
SELECT * FROM Student;

Output

ID Name Age Branch Branch_Code Hod_name

1 A 17 Civil 101 Aman

2 F 22 Electrical 103 Rakesh

3 G 16 Petroleum 104 Naman

2. Deletion anomaly:-
If we delete any data from the database and any other information which is required also gets
deleted with that deletion, then it is called deletion anomaly.
For example, suppose a student of the Electrical branch is leaving, so we have to delete that student's data. The problem is that if we delete the student's data, the branch data will also be deleted along with it, since there is only one student through whom the branch data is present.
Code
DELETE FROM Student WHERE Branch = 'Electrical';
SELECT * FROM Student;
Output

ID Name Age Branch Branch_Code Hod_name

1 A 17 Civil 101 Aman

3. Updation/modification anomaly:-
If we want to update a single piece of data, then we have to update all its other copies as well; this is called an updation (modification) anomaly.
For example, suppose we need to change the HOD name for the Civil branch. As per the requirement, only a single piece of data is to be changed, but we have to change it in every other row as well so as not to make the table inconsistent.
Code:-
UPDATE Student                 -- table selected to perform the task
SET Hod_name = 'RAHUL'         -- change to be made
WHERE Branch = 'Civil';        -- condition given
SELECT * FROM Student;         -- data selected
Output:-

ID Name Age Branch Branch_Code Hod_name

1 A 17 Civil 101 RAHUL

2 F 22 Electrical 103 Rakesh


Normalization
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
o Normalization divides the larger table into smaller and links them using relationships.
o The normal form is used to reduce redundancy from the database table.
Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization
o You cannot start building the database before knowing what the user needs.
o The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of a higher degree.
o Careless decomposition may lead to a bad database design, leading to serious problems.

1. The First Normal Form – 1NF


For a table to be in the first normal form, it must meet the following criteria:
 a single cell must not hold more than one value (atomicity)
 there must be a primary key for identification
 no duplicated rows or columns
 each column must have only one value for each row in the table
Example 1–
ID Name Courses
1 A c1, c2
2 E c3
3 M c2, c3
In the above table, Courses is a multi-valued attribute, so the table is not in 1NF. The table below is in 1NF, as there is no multi-valued attribute.
ID Name Courses
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3

2. The Second Normal Form – 2NF


The 1NF only eliminates repeating groups, not redundancy. That’s why there is 2NF.
A table is said to be in 2NF if it meets the following criteria:
 it’s already in 1NF
 has no partial dependency. That is, all non-key attributes are fully dependent on a primary
key.
If a partial dependency exists, we can divide the table to remove the partially dependent
attributes and move them to some other table where they fit in well.
Example:-

EMPLOYEE_ID NAME JOB_CODE JOB STATE_CODE HOME_STATE


E001 Alice J01 Chef 26 Michigan

E001 Alice J02 Waiter 26 Michigan


E002 Bob J02 Waiter 56 Wyoming

E002 Bob J03 Bartender 56 Wyoming


E003 Alice J01 Chef 56 Wyoming
Here the candidate key is {EMPLOYEE_ID, JOB_CODE}, but NAME, STATE_CODE and HOME_STATE depend only on EMPLOYEE_ID (a partial dependency), so the table is decomposed as follows:
employee_roles Table
EMPLOYEE_ID JOB_CODE

E001 J01

E001 J02

E002 J02

E002 J03

E003 J01

employees Table
EMPLOYEE_ID NAME STATE_CODE HOME_STATE

E001 Alice 26 Michigan

E002 Bob 56 Wyoming

E003 Alice 56 Wyoming

jobs table
JOB_CODE JOB

J01 Chef

J02 Waiter

J03 Bartender

home_state is now dependent on state_code. So, if you know the state_code, then you can find
the home_state value.

To take this a step further, we should separate them again to a different table to make it 3NF.
3.The Third Normal Form – 3NF
When a table is in 2NF, it eliminates repeating groups and redundancy, but it does not eliminate transitive dependency.

This means a non-prime attribute (an attribute that is not part of the candidate key) is dependent on another non-prime attribute. This is what the third normal form (3NF) eliminates.

So, for a table to be in 3NF, it must:

 be in 2NF
 have no transitive dependency.

Example of Third Normal Form (3NF)

employee_roles Table

EMPLOYEE_ID JOB_CODE
E001 J01
E001 J02
E002 J02
E002 J03
E003 J01

employees Table

EMPLOYEE_ID NAME STATE_CODE


E001 Alice 26
E002 Bob 56
E003 Alice 56

jobs Table:-
JOB_CODE JOB
J01 Chef
J02 Waiter
J03 Bartender
states Table

STATE_CODE HOME_STATE
26 Michigan
56 Wyoming

4. Boyce-Codd Normal Form (BCNF)


 Boyce and Codd Normal Form is a higher version of the Third Normal Form.
 This form deals with a certain type of anomaly that is not handled by 3NF.
 A 3NF table that does not have multiple overlapping candidate keys is said to be in
BCNF.
 For a table to be in BCNF, the following conditions must be satisfied:
o R must be in the 3rd Normal Form
o and, for each functional dependency ( X → Y ), X should be a Super Key.

Example: Let's assume there is a company where employees work in more than one department.

EMPLOYEE table:

EMP_ID EMP_COUNTRY EMP_DEPT DEPT_TYPE EMP_DEPT_NO

264 India Designing D394 283

264 India Testing D394 300

364 UK Stores D283 232

364 UK Developing D283 549

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone are keys.
To convert the given table into BCNF, we decompose it into three tables:
EMP_COUNTRY table:

EMP_ID EMP_COUNTRY

264 India

364 UK

EMP_DEPT table:

EMP_DEPT DEPT_TYPE EMP_DEPT_NO

Designing D394 283

Testing D394 300

Stores D283 232

Developing D283 549

EMP_DEPT_MAPPING table:

EMP_ID EMP_DEPT

264 Designing

264 Testing

364 Stores

364 Developing

Functional dependencies:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate keys: EMP_ID for the first table, EMP_DEPT for the second table, and {EMP_ID, EMP_DEPT} for the third table. Each table is now in BCNF because, for every functional dependency that holds in it, the left-hand side is a super key.
5. Fourth Normal Form (4NF)
A table is said to be in the Fourth Normal Form when:
1. It is in the Boyce-Codd Normal Form.
2. It has no non-trivial multi-valued dependency.

Example where a table is used to store the Roll Numbers and Names of the students enrolled in
a university.

ROLL_NO STUDENT
901 Armaan

902 Ashutosh

903 Baljeet
904 Bhupinder

Let's check for BCNF first :


The Candidate key is ROLL_NO, and the prime attribute is also ROLL_NO
The above table has a single value for each attribute, the non-key attribute STUDENT is fully dependent on the primary key, there is no transitive dependency for the non-key attribute STUDENT, and for ROLL_NO → STUDENT, ROLL_NO is the super key of the table. Therefore the above table is in BCNF.
Now let's check for Multi-Valued Dependency :
Since there are only two columns there is not any multi-valued dependency in the above table
hence the above table is in 4NF.
Another Example:-
STUDENT

STU_ID COURSE HOBBY

21 Computer Dancing

21 Math Singing
34 Chemistry Dancing

74 Biology Cricket

59 Physics Hockey

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent attributes; there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math, and two hobbies, Dancing and Singing. So there is a multi-valued dependency on STU_ID, which leads to unnecessary repetition of data.
So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE

STU_ID COURSE

21 Computer

21 Math

34 Chemistry

74 Biology

59 Physics

STUDENT_HOBBY

STU_ID HOBBY

21 Dancing

21 Singing

34 Dancing
74 Cricket

59 Hockey

6. Fifth Normal Form (5NF)


 The fifth normal form is also called the PJNF - Project-Join Normal Form
 It is the most advanced level of Database Normalization.
 Using Fifth Normal Form you can fix Join dependency and reduce data redundancy.
 It also helps in fixing Update anomalies in DBMS design.

Example – Consider a relation ACP(Agent, Company, Product), with the constraint that "if a company makes a product and an agent is an agent for that company, then he always sells that product for the company". Under these circumstances, the ACP table is shown as:

Table ACP

Agent Company Product

A1 PQR Nut

A1 PQR Bolt

A1 XYZ Nut

A1 XYZ Bolt

A2 PQR Nut
The relation ACP is decomposed into three relations, R1, R2 and R3, shown below. The natural join of all three relations reproduces ACP:
Table R1

Agent Company

A1 PQR

A1 XYZ

A2 PQR

Table R2

Agent Product

A1 Nut

A1 Bolt

A2 Nut

Table R3
Company Product

PQR Nut

PQR Bolt

XYZ Nut

XYZ Bolt

The result of the natural join of R1 and R3 over 'Company', followed by the natural join of that result (R13) and R2 over 'Agent' and 'Product', will be the table ACP.

SQL
Data type: Data types are used to represent the nature of the data that can be stored in the
database table. For example, in a particular column of a table, if we want to store a string type of
data then we will have to declare a string data type of this column.

Data types mainly classified into three categories for every database.

o String Data types


o Numeric Data types
o Date and time Data types

SQL Server String Data Type

DATA TYPE: DESCRIPTION & SIZE (STORAGE)

char(n): It is a fixed-width character string data type. Its size can be up to 8000 characters. (Storage: defined width)

varchar(n): It is a variable-width character string data type. Its size can be up to 8000 characters. (Storage: 2 bytes + number of chars)

varchar(max): It is a variable-width character string data type. Its size can be up to 1,073,741,824 characters. (Storage: 2 bytes + number of chars)

text: It is a variable-width character string data type. Its size can be up to 2 GB of text data. (Storage: 4 bytes + number of chars)

nchar: It is a fixed-width Unicode string data type. Its size can be up to 4000 characters. (Storage: defined width x 2)

nvarchar: It is a variable-width Unicode string data type. Its size can be up to 4000 characters.

ntext: It is a variable-width Unicode string data type. Its size can be up to 2 GB of text data.

binary(n): It is a fixed-width binary string data type. Its size can be up to 8000 bytes.

varbinary: It is a variable-width binary string data type. Its size can be up to 8000 bytes.

image: It is also a variable-width binary string data type. Its size can be up to 2 GB.

SQL Server Numeric Data Types

DATA TYPE: DESCRIPTION (STORAGE)

bit: It is an integer that can be 0, 1 or NULL.

tinyint: It allows whole numbers from 0 to 255. (Storage: 1 byte)

smallint: It allows whole numbers between -32,768 and 32,767. (Storage: 2 bytes)

int: It allows whole numbers between -2,147,483,648 and 2,147,483,647. (Storage: 4 bytes)

bigint: It allows whole numbers between -9,223,372,036,854,775,808 and 9,223,372,036,854,775,807. (Storage: 8 bytes)

float(n): It is used to specify floating precision number data from -1.79E+308 to 1.79E+308. The n parameter indicates whether the field should hold 4 or 8 bytes; the default value of n is 53. (Storage: 4 or 8 bytes)

real: It is floating precision number data from -3.40E+38 to 3.40E+38. (Storage: 4 bytes)

money: It is used to specify monetary data from -922,337,203,685,477.5808 to 922,337,203,685,477.5807. (Storage: 8 bytes)

SQL Server Date and Time Data Type

DATA TYPE: DESCRIPTION (STORAGE)

datetime: It is used to specify a date and time combination. It supports a range from January 1, 1753, to December 31, 9999, with an accuracy of 3.33 milliseconds. (Storage: 8 bytes)

datetime2: It is used to specify a date and time combination. It supports a range from January 1, 0001 to December 31, 9999, with an accuracy of 100 nanoseconds. (Storage: 6-8 bytes)

date: It is used to store a date only. It supports a range from January 1, 0001 to December 31, 9999. (Storage: 3 bytes)

time: It stores time only, to an accuracy of 100 nanoseconds. (Storage: 3-5 bytes)

timestamp: It stores a unique number that gets updated whenever a row is created or modified. The timestamp value is based upon an internal clock and does not correspond to real time. Each table may contain only one timestamp column.
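A minimal sketch showing several of these data types in use, assuming a hypothetical EMPLOYEES table:

CREATE TABLE EMPLOYEES (
    Emp_Id INT PRIMARY KEY,        -- whole numbers, 4 bytes
    Emp_Name VARCHAR(50) NOT NULL, -- variable-width character string
    Is_Active BIT,                 -- 0, 1 or NULL
    Salary MONEY,                  -- monetary data
    Joining_Date DATE,             -- date only
    Last_Login DATETIME2           -- date and time, 100-nanosecond accuracy
);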

Basic Queries in SQL:-


SQL UPDATE Statement
The UPDATE statement is used to modify the existing records in a table.

UPDATE Syntax
UPDATE table_name
SET column1 = value1, column2 = value2, ...
WHERE condition;

Example: consider the following Customers table. Be careful when updating records: if you omit the WHERE clause, ALL records will be updated!
CustomerID CustomerName ContactName Address City PostalCode Country

1 Alfreds Futterkiste Maria Anders Obere Str. 57 Berlin 12209 Germany

2 Ana Trujillo Emparedados y helados Ana Trujillo Avda. de la Constitución 2222 México D.F. 05021 Mexico

The following SQL statement updates the first customer (CustomerID = 1) with a new contact
person and a new city.
Code:- UPDATE Customers
SET ContactName = 'Alfred Schmidt', City= 'Frankfurt'
WHERE CustomerID = 1;

CustomerID CustomerName ContactName Address City PostalCode Country

1 Alfreds Futterkiste Alfred Schmidt Obere Str. 57 Frankfurt 12209 Germany

2 Ana Trujillo Emparedados y helados Ana Trujillo Avda. de la Constitución 2222 México D.F. 05021 Mexico

SQL DELETE Statement


The DELETE statement is used to delete existing records in a table.
DELETE Syntax

DELETE FROM table_name WHERE condition;


DELETE Example
The following SQL statement deletes the customer "Alfreds Futterkiste" from the "Customers"
table:
Code:-
DELETE FROM Customers WHERE CustomerName='Alfreds
Futterkiste';

CustomerID CustomerName ContactName Address City PostalCode Country

2 Ana Trujillo Emparedados y helados Ana Trujillo Avda. de la Constitución 2222 México D.F. 05021 Mexico

INSERT INTO Statement


The INSERT INTO statement is used to insert new records in a table.
INSERT INTO Syntax

INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);

INSERT INTO Example


Code:-
INSERT INTO Customers (CustomerName, ContactName, Address, City,
PostalCode, Country)
VALUES ('Cardinal', 'Tom B. Erichsen', 'Skagen
21', 'Stavanger', '4006', 'Norway');

Table:-
CustomerID CustomerName ContactName Address City PostalCode Country

89 White Clover Markets Karl Jablonski 305 - 14th Ave. S. Suite 3B Seattle 98128 USA

90 Wilman Kala Matti Karttunen Keskuskatu 45 Helsinki 21240 Finland

91 Wolski Zbyszek ul. Filtrowa 68 Walla 01-012 Poland

92 Cardinal Tom B. Erichsen Skagen 21 Stavanger 4006 Norway

Query Processing
General Strategies of Query Processing:- Query processing is a process of translating a user
query into an executable form. It helps to retrieve the results from a database. In query
processing, it converts the high-level query into a low-level query for the database. Query
processing is a very important component of DBMS. It is critical to the performance of
applications that rely on database operations. The flow of query processing in DBMS is
mentioned below:

Steps in Query Processing:- Query processing in DBMS involves several steps


Parsing
Query parsing is the first step in query processing. In this step, the query is checked for syntax errors and then converted into a parse tree. A parse tree represents the query in a format that is easy for the DBMS to understand, and it is used in the later steps of query processing.

Optimization
After doing query parsing, the DBMS starts finding the most efficient way to
execute the given query. The optimization process follows some factors for the
query. These factors are indexing, joins, and other optimization mechanisms.
These help in determining the most efficient query execution plan. So, query
optimization tells the DBMS what the best execution plan is for it. The main goal
of this step is to retrieve the required data with minimal cost in terms of resources
and time.

Evaluation
After finding the best execution plan, the DBMS starts the execution of the
optimized query. And it gives the results from the database. In this step, DBMS
can perform operations on the data. These operations are selecting the data,
inserting something, updating the data, and so on.
Once everything is completed, DBMS returns the result after the evaluation step.
This result is shown to you in a suitable format.

Query Processor:-
It interprets the requests (queries) received from end user via an application
program into instructions. It also executes the user request which is received from
the DML compiler.
Query Processor contains the following components –
 DML Compiler: It processes the DML statements into low level instruction
(machine language), so that they can be executed.
 DDL Interpreter: It processes the DDL statements into a set of table containing
meta data (data about data).
 Embedded DML Pre-compiler: It processes DML statements embedded in an
application program into procedural calls.
 Query Optimizer: It optimizes the instructions generated by the DML compiler and chooses the most efficient plan for executing them.

Concurrency Control
Concurrency control means that multiple transactions can be executed at the same time, so their log records are interleaved. Because the results of one transaction can be affected by others, the order of execution of those transactions must be maintained.
During recovery, it would be very difficult for the recovery system to backtrack through all the logs and then start recovering.
Recovery with concurrent transactions can be done in the following four ways.
1. Interaction with concurrency control
2. Transaction rollback
3. Checkpoints
4. Restart recovery
Interaction with concurrency control :
In this scheme, the recovery scheme depends greatly on the concurrency control scheme
that is used. So, to rollback a failed transaction, we must undo the updates performed by
the transaction.
Transaction rollback :
 In this scheme, we roll back a failed transaction by using the log.
 The system scans the log backward for the failed transaction; for every log record found, the system restores the data item to its old value.
Checkpoints :
 Checkpoints is a process of saving a snapshot of the applications state so that
it can restart from that point in case of failure.
 Checkpoint is a point of time at which a record is written onto the database
form the buffers.
 Checkpoint shortens the recovery process.
 When it reaches the checkpoint, then the transaction will be updated into the
database, and till that point, the entire log file will be removed from the file.
Then the log file is updated with the new step of transaction till the next
checkpoint and so on.
 The checkpoint is used to declare the point before which the DBMS was in
the consistent state, and all the transactions were committed.
To ease this situation, the 'checkpoint' concept is used by most DBMSs.
 In this scheme, we used checkpoints to reduce the number of log records that
the system must scan when it recovers from a crash.
 In a concurrent transaction processing system, we require that the checkpoint
log record be of the form <checkpoint L>, where ‘L’ is a list of transactions
active at the time of the checkpoint.
 A fuzzy checkpoint is a checkpoint where transactions are allowed to perform
updates even while buffer blocks are being written out.
Restart recovery :
 When the system recovers from a crash, it constructs two lists.
 The undo-list consists of transactions to be undone, and the redo-list consists of transactions to be redone.
 The system constructs the two lists as follows: initially, both are empty. The system scans the log backward, examining each record, until it finds the first <checkpoint L> record.
 For every <Ti commit> record found during this scan, Ti is added to the redo-list; every transaction Ti that has a <Ti start> record (or appears in the checkpoint's list L) but is not in the redo-list is added to the undo-list.
