Advanced Database Management Systems

The document outlines the curriculum for an Advanced Database Management Systems course offered by the Centre for Distance and Online Education at Parul University. It covers fundamental concepts of databases, the purpose and characteristics of Database Management Systems (DBMS), and various data models and structures. Additionally, it emphasizes the importance of data security, backup, and recovery techniques in managing databases.


Advanced Database Management Systems
Centre for Distance and Online Education
Online MCA Program
Advanced Database Management System

Semester: 1

Author

Mr. Jay Parmar, Assistant Professor, Online Degree-CDOE, Parul University

Credits
Centre for Distance and Online Education,
Parul University,
Post Limda, Waghodia,
Vadodara, Gujarat, India, 391760.

Website: https://paruluniversity.ac.in/

Disclaimer

This content is protected by CDOE, Parul University. It is sold under the stipulation that it cannot be
lent, resold, hired out, or otherwise circulated without obtaining prior written consent from the
publisher. The content should remain in the same binding or cover as it was initially published, and this
requirement should also extend to any subsequent purchaser. Furthermore, it is important to note that,
in compliance with the copyright protections outlined above, no part of this publication may be
reproduced, stored in a retrieval system, or transmitted through any means (including electronic,
mechanical, photocopying, recording, or otherwise) without obtaining the prior
written permission from both the copyright owner and the publisher of this
content.

Note to Students
These course notes are intended for the exclusive use of students enrolled in
Online MCA. They are not to be shared or distributed without explicit permission
from the University. Any unauthorized sharing or distribution of these materials
may result in academic and legal consequences.

BASIC CONCEPTS
Table of Contents
SUB LESSON 1.1
DATABASE
SUB LESSON 1.2
PURPOSE OF DATABASE
SUB LESSON 1.3
ADVANTAGES OF DBMS
SUB LESSON 2.1
INTRODUCTION OF DATA MODELS
SUB LESSON 2.2
TYPES OF DATA MODELS
SUB LESSON 2.3
THREE LEVEL ARCHITECTURE
SUB LESSON 2.4
VARIOUS COMPONENTS OF DBMS
SUB LESSON 3.1
PARALLEL DATABASE
SUB LESSON 3.2
DISTRIBUTED DATABASE
SUB LESSON 3.3
OBJECT ORIENTED DATABASE
SUB LESSON 3.4
OBJECT RELATIONAL DATABASE
SUB LESSON 4.1
TABLES
SUB LESSON 4.2
ENTITY IN DBMS
SUB LESSON 4.3
ATTRIBUTES IN DBMS
SUB LESSON 4.4
TYPES OF RELATIONSHIP IN DBMS

SUB LESSON 5.1
TYPES OF KEYS
SUB LESSON 5.2
SUPER KEY AND CANDIDATE KEY
SUB LESSON 5.3
PRIMARY KEY AND FOREIGN KEY
SUB LESSON 6.1
INDEX
SUB LESSON 7.1
DATABASE DESIGN
SUB LESSON 7.2
FUNCTIONAL DEPENDENCIES
SUB LESSON 7.3
NORMALIZATION
SUB LESSON 8.1
NORMAL FORM
SUB LESSON 8.2
1NF, 2NF, 3NF
SUB LESSON 9.1
OVERVIEW OF SQL
SUB LESSON 9.2
BASIC AND ADVANCED QUERIES IN SQL
SUB LESSON 9.3
RELATIONAL ALGEBRA AND CALCULUS
SUB LESSON 10.1
BASIC CODE STRUCTURE
SUB LESSON 10.2
DATA TYPES
SUB LESSON 10.3
CONTROL STRUCTURE, LOOPING STRUCTURES
SUB LESSON 11.1
CREATE/REPLACE VIEWS
SUB LESSON 12.1
UNDERSTANDING THE MAIN FEATURES OF STORED PROCEDURE
SUB LESSON 12.2
STORED PROCEDURE ARCHITECTURE
SUB LESSON 12.3
ADVANTAGES OF USING STORED PROCEDURE
SUB LESSON 13.1
FUNDAMENTALS OF DATABASE TRIGGERS
SUB LESSON 14.1
TRANSACTIONAL CONTROL
SUB LESSON 14.2
TCL COMMANDS
SUB LESSON 15.1
ROW LEVEL LOCKS & TABLE LEVEL LOCKS
SUB LESSON 15.2
EXCLUSIVE LOCK AND SHARED LOCK
SUB LESSON 15.3
DEADLOCK
SUB LESSON 16.1
METHODS FOR CONCURRENCY CONTROL
SUB LESSON 16.2
LOCKING METHODS
SUB LESSON 16.3
TIMESTAMP METHODS
SUB LESSON 16.4
OPTIMISTIC METHOD
SUB LESSON 17.1
DATABASE SECURITY AND ITS ISSUES
SUB LESSON 17.2
GRANTING AND REVOKING PRIVILEGES
SUB LESSON 17.3
ROLE-BASED ACCESS CONTROL
SUB LESSON 18.1
DATABASE BACKUP AND RECOVERY CONCEPTS

SUB LESSON 18.2
DATABASE BACKUP AND RECOVERY TECHNIQUES

SUB LESSON 1.1
DATABASE
WHAT IS DATA ?

➢ Data has become such a common word that many of us have probably never thought about its
exact definition. What first pops into our mind about data is most likely a spreadsheet, a table,
or a chart that comprises numbers and labels. Data is a raw and unorganized fact that needs to
be processed to make it meaningful.

➢ Generally, data comprises facts, observations, perceptions, numbers, characters, symbols,
images, etc. Data can be defined as a representation of facts, concepts, or instructions in a
formalized manner, suitable for communication, interpretation, or processing by humans or
electronic machines.

➢ Data is represented with the help of characters such as alphabets (A-Z, a-z), digits (0-9) or
special characters (+,-, /,*,<,>,= etc.).
➢ Data is nothing but facts and statistics stored or flowing freely over a network; generally,
it is raw and unprocessed.

WHAT IS INFORMATION ?

➢ Information is a set of data which is processed in a meaningful way according to the given
requirement. Information is processed, structured, or presented in a given context to make it
meaningful and useful.

➢ Information is processed, organised and structured data. It provides context for data and
enables decision making. For example, a single customer’s sale at a restaurant is data – this
becomes information when the business is able to identify the most popular or least popular
dish.

TYPES OF DATA :-

Data can be of two types :-

1) Qualitative data: non-numerical data, e.g., the texture of the skin, the colour of the
eyes, etc.
2) Quantitative data: data given in numbers. Data answering questions such as "how much"
and "how many" gives quantitative data.

The difference between data and information can be summarised parameter by parameter:

➢ Description − Data: qualitative or quantitative variables which help to develop ideas or conclusions. Information: a group of data which carries news and meaning.
➢ Etymology − Data: comes from the Latin word datum, which means "to give something"; over time, "data" has become the plural of datum. Information: the word has Old French and Middle English origins and referred to the "act of informing"; it is mostly used for education or other known communication.
➢ Format − Data: numbers, letters, or a set of characters. Information: ideas and inferences.
➢ Represented in − Data: can be structured, tabular data, graphs, data trees, etc. Information: language, ideas, and thoughts based on the given data.
➢ Meaning − Data: does not have any specific purpose. Information: carries meaning that has been assigned by interpreting data.
➢ Interrelation − Data: information that is collected. Information: data that is processed.
➢ Feature − Data: a single unit, and raw; it alone doesn't have any meaning. Information: the product of a group of data which jointly carries a logical meaning.
➢ Dependence − Data: never depends on information. Information: depends on data.
➢ Measuring unit − Data: measured in bits and bytes. Information: measured in meaningful units like time, quantity, etc.
➢ Support for decision making − Data: cannot be used for decision making. Information: widely used for decision making.
➢ Contains − Data: unprocessed raw facts. Information: facts processed in a meaningful way.
➢ Knowledge level − Data: low-level knowledge. Information: the second level of knowledge.
➢ Characteristic − Data: the property of an organization and not available for sale to the public. Information: available for sale to the public.
➢ Dependency − Data: depends upon the sources used for collecting it. Information: depends upon data.
➢ Example − Data: ticket sales for a band on tour. Information: a sales report by region and venue, which shows which venue is profitable for the business.
➢ Significance − Data: alone has no significance. Information: significant by itself.
➢ Basis − Data: based on records and observations, which are stored in computers or remembered by a person. Information: considered more reliable than data; it helps the researcher conduct a proper analysis.
➢ Usefulness − Data: as collected by the researcher, may or may not be useful. Information: useful and valuable, as it is readily available to the researcher for use.
➢ Specificity − Data: never designed for the specific needs of the user. Information: always specific to the requirements and expectations, because all irrelevant facts and figures are removed during the transformation process.

DATABASE AND DBMS :-

What is Database ?
➢ A Database is a collection of inter-related (logically-related) data.
➢ A database is a collection of inter-related data used to retrieve, insert, and delete
data efficiently. It is also used to organize the data in the form of tables, schemas, views,
reports, etc. Examples include a library's book database, or a college database that organizes
data about the admin, staff, students, faculty, etc.

➢ Using the database, you can easily retrieve, insert, and delete the information.
➢ Mostly, data represents recordable facts. Data aids in producing information, which is based on
facts. For example, if we have data about the marks obtained by all students, we can then draw
conclusions about toppers and average marks.

DATABASE MANAGEMENT SYSTEM :-

➢ A database management system is software used to manage databases. For example,
MySQL, Oracle, SQL Server, IBM DB2, etc. are very popular database systems used
in different applications.
➢ DBMS provides an interface to perform various operations like database creation, storing data
in it, updating data, creating a table in the database and a lot more.
➢ It provides protection and security to the database. In the case of multiple users, it also
maintains data consistency.
➢ A DBMS stores data in such a way that it becomes easier to retrieve, manipulate, and produce
information.

DBMS = Database + Management System.


A Database Management System stores data in such a way that it becomes easier to retrieve,
manipulate, and produce information.
Database Management System, or DBMS in short, refers to the technology of storing and
retrieving users' data with utmost efficiency along with appropriate security measures.
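As a small sketch of this idea, Python's built-in sqlite3 module (a lightweight DBMS) can create a table, store data, and then derive information from it, echoing the toppers-and-average example above. The student table and its values are invented for illustration.

```python
import sqlite3

# An in-memory database; the DBMS handles storage and retrieval for us.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Define the table: field names plus their data types.
cur.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# Store data.
cur.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 91), (2, "Ravi", 78), (3, "Meena", 85)],
)
conn.commit()

# Retrieve information derived from the data: the topper and the average marks.
topper = cur.execute("SELECT name FROM student ORDER BY marks DESC LIMIT 1").fetchone()[0]
average = cur.execute("SELECT AVG(marks) FROM student").fetchone()[0]
print(topper, average)  # Asha 84.666...
```

The raw marks are data; the topper's name and the average are information produced from that data by the DBMS.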

WHY TO LEARN DBMS ?

Traditionally, data was organized in file formats. DBMS was a new concept then, and all the
research was done to make it overcome the deficiencies in traditional style of data
management.

A MODERN DBMS HAS THE FOLLOWING CHARACTERISTICS: -

➢ Real-world entity − A modern DBMS is more realistic and uses real-world entities to design its
architecture. It uses the behavior and attributes too. For example, a school database may use
students as an entity and their age as an attribute.

➢ Relation-based tables − DBMS allows entities and relations among them to form tables. A user
can understand the architecture of a database just by looking at the table names.

➢ Isolation of data and application − A database system is entirely different than its data. A
database is an active entity, whereas data is said to be passive, on which the database works
and organizes. DBMS also stores metadata, which is data about data, to ease its own process.

➢ Less redundancy − A DBMS follows the rules of normalization, which splits a relation when any of
its attributes has redundancy in values. Normalization is a mathematically rich and
scientific process that reduces data redundancy.

➢ Consistency − Consistency is a state where every relation in a database remains consistent.
There exist methods and techniques that can detect attempts to leave the database in an
inconsistent state. A DBMS can provide greater consistency compared to earlier forms of
data-storing applications like file-processing systems.

➢ Query Language − A DBMS is equipped with a query language, which makes it more efficient to
retrieve and manipulate data. A user can apply as many different filtering options as
required to retrieve a set of data. Traditionally, this was not possible where file-processing
systems were used.

➢ Database Users − A typical DBMS has users with different rights and permissions who use it for
different purposes. Some users retrieve data and some back it up. The users of a DBMS can be
broadly categorized as follows.

➢ Administrators − Administrators maintain the DBMS and are responsible for administering the
database. They look after its usage and decide by whom it should be used. They
create access profiles for users and apply limitations to maintain isolation and enforce security.
Administrators also look after DBMS resources like system licenses, required tools, and other
software- and hardware-related maintenance.

➢ Designers − Designers are the group of people who actually work on the designing part of the
database. They keep a close watch on what data should be kept and in what format. They
identify and design the whole set of entities, relations, constraints, and views.

➢ End Users − End users are those who actually reap the benefits of having a DBMS. End users
can range from simple viewers who pay attention to the logs or market rates to sophisticated
users such as business analysts.
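The query-language characteristic described above can be illustrated with Python's built-in sqlite3 module; the employee table, its columns, and the filter values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (emp_code TEXT, name TEXT, dept TEXT, salary INTEGER)")
cur.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [("E1", "Asha", "Sales", 50000),
     ("E2", "Ravi", "IT", 65000),
     ("E3", "Meena", "IT", 72000),
     ("E4", "John", "Sales", 48000)],
)

# Combine as many filtering options as needed in one declarative query --
# something a file-processing program would have to hand-code loop by loop.
names = [row[0] for row in cur.execute(
    "SELECT name FROM employee WHERE dept = ? AND salary > ? ORDER BY name",
    ("IT", 60000),
)]
print(names)  # ['Meena', 'Ravi']
```

Adding another filter is a one-line change to the WHERE clause, not a rewrite of file-handling code.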

KEY TAKEAWAYS:-
➢ Data is unorganized and unstructured; it is the raw material from which we can
gather useful information.
➢ Information is processed data.
➢ A database system is a combination of a database and a DBMS.
➢ DBMS Stands for Database Management System.
➢ Various types of DBMS software are available; some are freeware and some are
proprietary products of particular companies.
➢ DBMS software is used to store data in a particular format.
➢ In DBMS software, data can be stored in a tabular format in terms of rows and columns.
➢ Two things need to be specified at the time of creating a table: the field names and their
data types.

SUB LESSON 1.2

PURPOSE OF DATABASE

PURPOSE OF DATABASE SYSTEM :-

A database management system is a collection of tools that enable users to create and manage
databases. In other words, it is general-purpose software that allows users to create,
manipulate, and design databases for a number of purposes.

Database systems are designed to deal with large volumes of data. Data management comprises
both the construction of data storage systems and the provision of data manipulation methods.
Furthermore, the database system must maintain the security of the information held despite
system crashes or attempts at unauthorized access. The system must avoid any unexpected
effects if data is to be shared across multiple users.

Before DBMSs, database applications were built directly on top of the file system.


The goal of a database management system (DBMS) is to transform the following:
1. Data into information.
2. Information into knowledge.
3. Knowledge into action.

CHARACTERISTICS OF DBMS :-

➢ Firstly, it manages and stores information in a server-based digital repository.


➢ Secondly, it can logically and visibly represent the data transformation process.
➢ Automatic backup and recovery techniques are built into the database management
system.
➢ It has ACID features, which ensure that data is safe even if the system fails.
➢ It has the ability to make complex data connections more understandable.

➢ It’s utilized to help with data manipulation and processing.
➢ It is utilized to keep information safe.
➢ Lastly, It can examine the database from a variety of perspectives, depending on the
needs of the user.

ADVANTAGES OF DBMS:-

➢ Firstly, because it saves all of the data in a single database file and each record is
stored only once, it can control database redundancy.
➢ Data sharing: Authorized users of a database management system can share data with a
large number of other users.
➢ The database system is relatively easy to maintain due to its centralized architecture.
➢ It saves time by lowering the time it takes to create a product as well as the time it takes
to maintain it.
➢ Backup: It consists of backup and recovery subsystems that create automatic data
backups in the case of hardware or software failures and restore the data if necessary.
➢ Lastly, it offers graphical user interfaces and application program interfaces, among
other options.

DISADVANTAGES OF DBMS :-

➢ Hardware and software costs: To operate DBMS software, you’ll need a fast data
processor and a lot of memory.
➢ Size: To run them efficiently, it takes up a lot of disc space and RAM.
➢ Complexity: The database system adds to the complexity and demands.
➢ Failure has a greater impact on the database since most organizations keep all of their
data in a single database; if the database is damaged due to an electric outage or database
corruption, the data might be lost permanently.

APPLICATIONS OF DBMS :-

➢ Railway Reservation System − The railway reservation system database plays a very important
role by keeping records of ticket bookings and train departure and arrival status, and it also
informs people about train delays.

➢ Library Management System − It has become easy to track and maintain each book in a library
because of the database. Since a library holds thousands of books, it is very difficult to keep
a record of all of them in a register. A DBMS is now used to maintain all the information
related to book issue dates, book names, authors, and availability.
➢ Banking − Banking is one of the main applications of databases. We all know there will be a
thousand transactions through banks daily and we are doing this without going to the bank.
This is all possible just because of DBMS that manages all the bank transactions.
➢ Universities and colleges − Examinations are now conducted online, so universities and
colleges maintain a DBMS to store students' registration details, results, courses, and grades.
➢ Telecommunications − Telecommunication companies depend on a DBMS; it is used to store
call details and monthly postpaid bills.
➢ Credit card transactions − The purchase of items and transactions of credit cards are made
possible only by DBMS. A credit card holder has to know the importance of their information
that all are secured through DBMS.
➢ Social Media Sites − By filling the required details we are able to access social media platforms.
Many users sign up daily on social websites such as Facebook, Pinterest and Instagram. All the
information related to the users are stored and maintained with the help of DBMS.
➢ Finance − Now-a-days there are lots of things to do with finance like storing sales, holding
information and finance statement management etc. these all can be done with database
systems.
➢ Military − In military areas the DBMS is playing a vital role. Military keeps records of soldiers
and it has so many files that should be kept secure and safe. DBMS provides a high security to
military information.
➢ Online Shopping − Online shopping saves the time of visiting a store, and it is made possible
by a DBMS. Products are listed and sold with the help of a DBMS, which records purchase
information, invoices, and payments.

➢ Human Resource Management − The management keeps records of each employee’s salary,
tax and work through DBMS.
➢ Manufacturing − Manufacturing companies make products and sell them on a daily basis. To
keep records of all those details DBMS is used.
➢ Airline Reservation system − Just like the railway reservation system, airlines also need DBMS
to keep records of flights arrival, departure and delay status.
So finally, we can clearly conclude that the DBMS is playing a very important role in each and
every field.

KEY TAKEAWAYS
➢ The purpose of a database is to store data in a particular format and in an organized way.
➢ Organizing data appropriately is the reason for using a database.
➢ Data can be Added, Modified, Deleted based on the requirement of the user.
➢ Adding a record to a database table is called insertion of data.
➢ Updating a record in a database table is called updating of data.
➢ Deleting a record from a database table is called deletion of data.
➢ ACID Property is very important in Database Terminology.
➢ In the ACID property:
➢ A stands for Atomicity: each transaction (to read, write, update, or delete data) is
treated as a single unit. Either the entire transaction is executed, or none of it is. This
property prevents data loss and corruption from occurring if, for example, a streaming
data source fails mid-stream.
➢ C stands for Consistency: it ensures that transactions only make changes to tables in predefined,
predictable ways. Transactional consistency ensures that corruption or errors in your data do
not create unintended consequences for the integrity of your table.
➢ I stands for Isolation: when multiple users are reading and writing from the same table all at
once, isolation of their transactions ensures that the concurrent transactions don't interfere
with or affect one another. Each request can behave as though it were occurring on its own,
even though they're actually occurring simultaneously.
➢ D stands for Durability: it ensures that changes to your data made by successfully executed
transactions will be saved, even in the event of system failure.
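The atomicity property can be demonstrated with Python's built-in sqlite3 module: a simulated failure in the middle of a hypothetical two-step transfer leaves the stored data untouched. The account table and values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 50)")
conn.commit()

# A transfer is one logical unit: debit A, then credit B -- all or nothing.
try:
    with conn:  # the connection commits on success, rolls back on an exception
        conn.execute("UPDATE account SET balance = balance - 30 WHERE name = 'A'")
        # Simulate a crash before the matching credit to B is applied:
        raise RuntimeError("simulated failure mid-transaction")
except RuntimeError:
    pass  # the half-done transaction was rolled back automatically

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)  # {'A': 100, 'B': 50} -- unchanged, thanks to atomicity
```

Without atomicity, account A would have lost 30 while B gained nothing; the DBMS guarantees the debit is undone.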

SUB LESSON 1.3

ADVANTAGES OF DBMS

ADVANTAGES OF DBMS ARE LISTED BELOW : -

➢ Better use of data or information - We can easily and efficiently access well-managed
and synchronized forms of data with the help of DBMS. It makes data handling simple, provides
an integrated perspective of how a certain business is operating and also aids in keeping track
of how one element of the business affects another portion.

➢ Secured Data − The likelihood of security problems increases as a database becomes
more functional and accessible. The danger to data security rises as the rate at which
data is shared or transferred grows along with the user base. This is a frequent concern
in the business world, where organizations spend a lot of time, money, and effort making
sure data is protected and handled effectively. Database management systems offer a stronger
framework for data privacy and security policies, assisting businesses in enhancing data
security.

➢ Reduces Data Inconsistency and Redundancy - The major issues faced during the
process of storing data are inconsistency and redundancy. Inconsistent data may lead to a big
loss to an individual or a business model and the storage capacity is not utilized properly
because of the data redundancy. When multiple copies with different versions or values of the
same data exist in various locations, then it causes inconsistency. Data Redundancy and
inconsistency can both be significantly decreased by properly designing a database with the
help of a database management system.

➢ Data consistency − Data consistency is ensured in a database because there is no data
redundancy. All data appears consistently across the database, and the data is the same for all
users viewing it. Moreover, any changes made to the database are immediately reflected
to all users, so there is no data inconsistency.

➢ Better data integrity − Data integrity means that the data is accurate and consistent
in the database. Data integrity is very important as there are multiple databases in a DBMS.
Because of the database management system we have access to a well-managed and synchronized
form of data; this makes data handling very easy, gives an integrated view of how a
particular organization is working, and helps to keep track of how one segment of the
company affects another.

➢ Better Recovery and Backups - Backup and recovery are handled automatically by the
DBMS. Users don't need to regularly back up their data because the DBMS handles this for
them. Additionally, it returns the database to its prior state following a crash or system failure.

➢ Fast Data Sharing - Database administration makes it possible for consumers to access
more and better-managed data. DBMS enables end users to quickly scan their environment and
react to any alterations made there.

➢ Helps in decision-making - Because of the well-managed data and improved data access
provided by DBMS, we are able to produce better-quality information and, as a result, make
better judgments. Accuracy, validity, and the time it takes to read data are all improved by
better data quality. Although DBMS does not ensure data quality, it does offer a framework
that makes it simple to enhance data quality.

➢ Increases Privacy - The privacy rule in a database specifies the privacy restrictions that
can only be accessed by authorized users. A user can only view the data he is permitted to view,
since there are different degrees of database access. For instance, on social networking sites,
different accounts that a user wishes to access have varying access restrictions and a user can
only see his/her account details, not others.

➢ User Friendly - Data are presented in a straightforward and logical manner by database
management systems (DBMS). It is simple to carry out many activities, such as the addition,
deletion, or creation of files or data.

➢ Data Abstraction − Database systems are primarily used to give users an abstract view of the
data. Developers employ numerous intricate algorithms to boost the effectiveness of databases,
and these are concealed from users by several levels of data abstraction, so consumers can
easily engage with the system.
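The redundancy reduction described under "Reduces Data Inconsistency and Redundancy" can be sketched in plain Python; the employee and department values are hypothetical.

```python
# A flat file repeats the department's location on every employee record:
flat = [
    ("E1", "Asha",  "Sales", "Block A"),
    ("E2", "Ravi",  "Sales", "Block A"),   # "Block A" stored twice
    ("E3", "Meena", "IT",    "Block C"),
]

# Splitting into two relations stores each fact exactly once:
employee = [("E1", "Asha", "Sales"), ("E2", "Ravi", "Sales"), ("E3", "Meena", "IT")]
department = {"Sales": "Block A", "IT": "Block C"}

# The original rows can still be reconstructed by joining on the department
# name, so no information is lost, while the redundancy (and the chance of
# the two "Block A" copies drifting apart) is removed.
rejoined = [(code, name, dept, department[dept]) for code, name, dept in employee]
assert rejoined == flat
```

If the Sales department moves, the location now has to be updated in exactly one place, which is how a well-designed database avoids inconsistency.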

KEY TAKEAWAYS
➢ Using database technologies we can secure data.
➢ All ACID Properties are being followed by DBMS Software.
➢ Normalization is also maintained by DBMS Software.
➢ Redundancy of Data will not be a problem in DBMS Software.

DATA MODELS


SUB LESSON 2.1

INTRODUCTION OF DATA MODELS


INTRODUCTION :-

➢ Data models define how the logical structure of a database is modelled. Data Models are
fundamental entities to introduce abstraction in a DBMS. Data models define how data is
connected to each other and how they are processed and stored inside the system.
➢ The very first data models were flat data models, where all the data was kept in the same
plane. Earlier data models were not so scientific; hence, they were prone to introduce lots
of duplication and update anomalies.
➢ Data models are used to describe how the data is stored, accessed, and updated in a DBMS. A
set of symbols and text is used to represent them so that all the members of an organization
can understand how the data is organized. It provides a set of conceptual tools that are vastly
used to represent the description of data.
➢ The Data Model gives us an idea of how the final system would look after it has been fully
implemented. It specifies the data items as well as the relationships between them. In a
database management system, data models are often used to show how data is connected,
stored, accessed, and changed. We portray the information using a set of symbols and language
so that members of an organisation may understand and comprehend it and then
communicate.
➢ Data Model is the modelling of the data description, data semantics, and consistency
constraints of the data. It provides the conceptual tools for describing the design of a database
at each level of data abstraction.



TYPES OF DATA MODELS:-

➢ Hierarchical Model
➢ Network Model
➢ Entity-Relationship Model
➢ Relational Model
➢ Object-Oriented Data Model

ADVANTAGES OF DATA MODELS IN DBMS :-

➢ Data models ensure that the data is represented accurately.


➢ The relationship between the data is well-defined.
➢ Data redundancy can be minimized and missing data can be identified easily.
➢ Last but not least, the security of the data is not compromised.

DISADVANTAGES OF DATA MODELS IN DBMS :-

➢ The biggest disadvantage of the data model is that one must know the characteristics of
the physical data to build a data model.
➢ Sometimes in big databases it is quite difficult to understand the data model, and the
cost incurred is very high.

KEY TAKEAWAYS:-



➢ Data Model describes how the logical structure of a database is modelled. Data Models are
fundamental entities to introduce abstraction in a DBMS. Data models define how data is
connected to each other and how they are processed and stored inside the system.
Various types of Data Models are:
➢ Hierarchical Model
➢ Network Model
➢ Entity-Relationship Model
➢ Relational Model
➢ Object-Oriented Data Model
➢ Main Advantage of Data Models is Data redundancy can be minimized and missing data can be
identified easily.
➢ The biggest disadvantage of the data model is that one must know the characteristics of
the physical data to build a data model.



SUB LESSON 2.2

TYPES OF DATA MODELS


TYPES OF DATA MODELS:-

There are four different types of data models:

1. HIERARCHICAL MODEL:-

➢ In this type of data model, the data is organized into a tree-like structure that has a single
root, and the data is linked to the root. The main hierarchy begins at the root and expands
like a tree with child nodes, which expand further in the same manner. In this model a child
node has a single parent node, but one parent can have multiple child nodes. Because the data
is stored as a tree structure, the whole tree is traversed from the root node when data is
retrieved. The hierarchical data model contains one-to-many relationships between various
types of data. The data is stored in the form of records and is connected through links.

For example, consider an organization that needs to store the information of its employees.
The employee table contains the following attributes: employee name, employee code,
department name, and last name. The organization also provides a computer to each employee,
so the information about computers is stored in a separate table. The computer table stores
the employee code, serial number, and type. According to the hierarchical data model, the
employee table can be considered the parent table and the computer table its child.
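The employee-and-computer example can be sketched as a literal tree in Python (the record values are invented): each computer hangs off exactly one employee, and lookup starts at the root.

```python
# One root, and every child record (a computer) has exactly one parent
# record (an employee); retrieval traverses the tree from the root.
org_tree = {
    "employees": [
        {"emp_code": "E1", "name": "Asha", "dept": "IT",
         "computers": [{"serial": "C-101", "type": "laptop"}]},
        {"emp_code": "E2", "name": "Ravi", "dept": "Sales",
         "computers": [{"serial": "C-102", "type": "desktop"},
                       {"serial": "C-103", "type": "laptop"}]},
    ]
}

def find_computers(emp_code):
    """Walk the tree from the root to one employee's child records."""
    for emp in org_tree["employees"]:
        if emp["emp_code"] == emp_code:
            return [c["serial"] for c in emp["computers"]]
    return []

print(find_computers("E2"))  # ['C-102', 'C-103']
```

Note the one-to-many shape: an employee may own several computers, but each computer record belongs to exactly one employee.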



2. NETWORK MODEL :-

➢ The network model is a type of database model designed around a flexible approach for
representing objects and the relationships that exist among them. The schema is very important
in the network data model; it can be represented as a graph in which relationships are
represented by edges and objects by nodes. The basic difference between the hierarchical model
and the network model is that data is represented as a hierarchy in the hierarchical model,
whereas in the network model it is represented as a graph. One advantage of the network model
is that basic connections are also represented directly. Different types of relationships can
exist in this model, such as one-to-one and many-to-many. Data access is simpler compared to
other models such as the hierarchical model. A parent node and child node are always connected,
as a relationship always exists between them, and the data does not depend on other nodes. One
key drawback of this model is that the system is not adaptive to change: when some modification
is required, the whole system must be changed, which takes a lot of effort. Maintaining the data
is also difficult, as every record is connected via pointers, which makes maintenance hard and
the system complex.


3. E-R MODEL :-

➢ The ER model describes the structure of a database using an entity-relationship diagram.
The E-R model is like a blueprint of the database, from which the database is implemented.
An entity set consists of entities of a similar type, described by attributes, and
relationships exist between entity sets, which is what the ER diagram shows.
➢ The components of the ER model are entity sets, relationship sets, and attributes. An
entity is a component of data, represented as a rectangle in the ER diagram. For example,
with two entities, College and Student, there is a one-to-many relationship between them,
since more than one student can go to a college.


➢ An entity that cannot be identified by its own attributes and requires a relationship
with another entity is called a weak entity. A weak entity is represented by a double
rectangle. For example, a bank account cannot be uniquely identified until the bank it
belongs to is known, so the bank account is a weak entity.
➢ Attributes represent the properties of an entity. In the ER diagram, an attribute is
represented as an oval. There are different types of attributes, such as key attributes,
composite attributes, multi-valued attributes, and derived attributes. For example, a
student is an entity, and the related attributes of the student entity are student name,
student age, student roll number, student address, etc.
➢ A relationship is represented by a diamond shape in the ER diagram. Relationships exist
among entities, and there are multiple types of relationships: one-to-one, one-to-many,
many-to-one, and many-to-many.


4. RELATIONAL MODEL :-

➢ In this data model, data is collected into relations, represented as tables. Relationships
and data are represented using interrelated tables, each with multiple rows and columns: a
column represents an attribute of the entity, and a row represents a record. A primary key is
used to distinguish each record in a table, and its value must be unique for every entry.
SQL (Structured Query Language) is used to retrieve the data elements, and the primary key is
the fundamental tool for working with the relational model. A data table should not contain
any inconsistency, as that creates problems at the time of data retrieval. Other problems
that can affect the relational data model are data duplication, incomplete data, and
inappropriate links used to connect data.
Stu. Id | Name    | Branch
101     | Naman   | CSE
102     | Saloni  | ECE
103     | Rishabh | IT
104     | Pulkit  | ME
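The student relation above can be built and queried with SQL, which the relational model uses for retrieval. A runnable sketch using Python's built-in SQLite driver (the table and column names mirror the example; the in-memory database is purely illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# The primary key (stu_id) distinguishes each record.
con.execute(
    "CREATE TABLE student (stu_id INTEGER PRIMARY KEY, "
    "name TEXT, branch TEXT)"
)
rows = [(101, "Naman", "CSE"), (102, "Saloni", "ECE"),
        (103, "Rishabh", "IT"), (104, "Pulkit", "ME")]
con.executemany("INSERT INTO student VALUES (?, ?, ?)", rows)

# The primary key rejects a duplicate entry, enforcing uniqueness.
try:
    con.execute("INSERT INTO student VALUES (101, 'Dup', 'CE')")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

# SQL retrieves data by attribute value.
print(con.execute(
    "SELECT name FROM student WHERE branch = 'CSE'").fetchall())
```

The rejected duplicate shows why the primary key is called the fundamental tool of the model: without it, two rows with the same identifier could not be told apart at retrieval time.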

KEY TAKEAWAYS:-

Types of Data Models are:

1. HIERARCHICAL MODEL :-

In this type of data model, the data is organized into a tree-like structure that has a single root
and the data is linked to the root.

2. NETWORK MODEL :-

The network model is a database model designed around a flexible approach to representing
objects and the relationships that exist among them.


3. E-R MODEL :-

The ER model is used to describe the database structure using the entity-relationship diagram.
The E-R model is just like the blueprint of a database which is used to implement the database.

4. RELATIONAL MODEL :-

In this data model, the data tables are used to collect a group of elements into the relations. In
this model, the relationships and data are represented using interrelated tables.



SUB LESSON 2.3

THREE LEVEL ARCHITECTURE

THREE LEVEL ARCHITECTURE :-

➢ The three schema architecture is also called ANSI/SPARC architecture or three-level
architecture.
➢ This framework is used to describe the structure of a specific database system.
➢ The three schema architecture is also used to separate the user applications and
physical database.
➢ The three schema architecture contains three-levels. It breaks the database down into
three different categories.
The three-schema architecture diagram (not reproduced here) shows the following:


➢ It shows the DBMS architecture.
➢ Mapping is used to transform requests and responses between the various levels of the
architecture.
➢ Mapping is not ideal for a small DBMS, because it takes more time.
➢ In external/conceptual mapping, the DBMS transforms a request at the external level into
one against the conceptual schema.
➢ In conceptual/internal mapping, the DBMS transforms the request from the conceptual level
to the internal level.

OBJECTIVES OF THREE SCHEMAS ARCHITECTURE :-

The main objective of three level architecture is to enable multiple users to access the same
data with a personalized view while storing the underlying data only once. Thus it separates the
user's view from the physical structure of the database. This separation is desirable for the
following reasons :-
➢ Different users need different views of the same data.
➢ The way a particular user needs to view the data may change over time.
➢ The users of the database should not worry about the physical implementation and
internal workings of the database such as data compression and encryption techniques,
hashing, optimization of the internal structures etc.
➢ All users should be able to access the same data according to their requirements.
➢ The DBA should be able to change the conceptual structure of the database without
affecting the users' views.
➢ Internal structure of the database should be unaffected by changes to physical aspects
of the storage.

1. INTERNAL LEVEL :-


➢ The internal level has an internal schema which describes the physical storage structure
of the database.
➢ The internal schema is also known as a physical schema.
➢ It uses the physical data model and defines how the data will be stored in blocks.
➢ The physical level is used to describe complex low-level data structures in detail.
The internal level is generally concerned with the following activities:
➢ Access paths.
For Example: Specification of primary and secondary keys, indexes, pointers and sequencing.
➢ Data compression and encryption techniques.
➢ Optimization of internal structures.
➢ Representation of stored fields.

2. CONCEPTUAL LEVEL :-

➢ The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.


➢ The conceptual schema describes the structure of the whole database.
➢ The conceptual level describes what data are to be stored in the database and also
describes what relationship exists among those data.
➢ In the conceptual level, internal details such as an implementation of the data structure
are hidden.
➢ Programmers and database administrators work at this level.

3. EXTERNAL LEVEL :-

➢ At the external level, a database contains several schemas, sometimes called subschemas.
A subschema describes a different view of the database.
➢ An external schema is also known as a view schema.
➢ Each view schema describes the part of the database that a particular user group is
interested in and hides the rest of the database from that group.
➢ The view schema describes the end user's interaction with the database system.

MAPPING BETWEEN VIEWS :-

The three levels of DBMS architecture do not exist independently of each other; there must be
a correspondence between them, i.e. a definition of how they relate to one another. The DBMS
is responsible for this correspondence between the three types of schema, which is called
mapping.

There are basically two types of mapping in the database architecture:


➢ Conceptual/ Internal Mapping
➢ External / Conceptual Mapping

THREE LEVEL ARCHITECTURE


Conceptual/ Internal Mapping
The Conceptual/ Internal Mapping lies between the conceptual level and the internal level. Its
role is to define the correspondence between the records and fields of the conceptual level and
files and data structures of the internal level.
External/ Conceptual Mapping
The external/conceptual mapping lies between the external level and the conceptual level. Its
role is to define the correspondence between a particular external view and the conceptual view.
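One concrete way external/conceptual mapping shows up in practice is the SQL view: the base table plays the role of the conceptual schema, and a view gives one user group its own external schema while the DBMS maps queries on the view onto the base table. A sketch with SQLite (names are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure of the data.
con.execute("CREATE TABLE employee (emp_id INTEGER, name TEXT, "
            "department TEXT, salary REAL)")
con.execute("INSERT INTO employee VALUES (1, 'Asha', 'CSE', 50000)")

# External level: a view that hides salary from general users.
con.execute("CREATE VIEW employee_public AS "
            "SELECT emp_id, name, department FROM employee")

# The DBMS maps this query on the view onto the base table.
print(con.execute("SELECT * FROM employee_public").fetchall())
```

Changing how `employee` is physically stored, or adding columns to it, need not disturb users of `employee_public`; that insulation is the point of the mapping.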

KEY TAKEAWAYS :-

OBJECTIVES OF THREE SCHEMA ARCHITECTURE :-

The main objective of three level architecture is to enable multiple users to access the same
data with a personalized view while storing the underlying data only once.


1. INTERNAL LEVEL
• The internal level has an internal schema which describes the physical storage structure
of the database.

2. CONCEPTUAL LEVEL
• The conceptual schema describes the design of a database at the conceptual level.
Conceptual level is also known as logical level.

3. EXTERNAL LEVEL
• At the external level, a database contains several schemas, sometimes called subschemas.
A subschema describes a different view of the database.



SUB LESSON 2.4

VARIOUS COMPONENTS OF DBMS


A Database Management System (DBMS) is software that manages and organizes data in a
database. It provides various functionalities for creating, retrieving, updating, and managing
data. The key components of a typical DBMS include:
➢ Data Definition Language (DDL): DDL is a subset of SQL (Structured Query Language) used to
define the structure of a database. It includes commands like CREATE, ALTER, DROP, and
TRUNCATE to define tables, indexes, and constraints.
➢ Data Manipulation Language (DML): DML is used for manipulating data stored in the database.
Common DML commands include SELECT, INSERT, UPDATE, and DELETE.
➢ Data Query Language (DQL): DQL is used to retrieve data from the database. The most
common DQL command is SELECT.
➢ Data Control Language (DCL): DCL is responsible for controlling access to data within the
database. It includes commands like GRANT and REVOKE, which grant or revoke privileges and
permissions to users.
➢ Transaction Management and Concurrency Control: These components ensure data
consistency in multi-user environments. They handle the execution of multiple transactions and
maintain data integrity through mechanisms like locking and isolation levels.
➢ Query Optimization: The query optimizer is responsible for analyzing and optimizing SQL
queries to improve their execution efficiency. It determines the best way to retrieve data from
the database.
➢ Storage Manager: The storage manager handles the physical storage of data on storage
devices. It manages data files, storage allocation, and data retrieval.
➢ Buffer Manager: The buffer manager is responsible for managing a buffer pool in memory. It
keeps frequently accessed data in memory to reduce the I/O operations and improve query
performance.


➢ Security and Authorization: DBMS systems have built-in security features to control access to
the data. They allow database administrators to set permissions and control who can access
and modify data.
➢ Data Dictionary or Metadata Repository: This component stores information about the
database structure, schema, tables, indexes, and more. It is used for data management and
schema management.
➢ Backup and Recovery Manager: This component is responsible for creating backups of the
database to prevent data loss and for recovering the database in case of failures or data
corruption.
➢ Report Generator: Some DBMS systems have built-in tools or support for generating reports
and analytics based on the data stored in the database.
➢ Client-Server Architecture: DBMS often use a client-server model where the database server
handles data storage and retrieval while clients (applications or users) interact with the
database server to perform operations.
➢ Data Models: DBMS can support various data models, including relational, hierarchical,
network, and object-oriented, among others.
➢ Database Administration Tools: These tools provide database administrators with a user-
friendly interface to manage and monitor the database system.
➢ Concurrency and Transaction Control: DBMS systems have mechanisms to manage concurrent
access to the data to ensure data consistency and integrity, including locking and transaction
management.

These components work together to provide efficient data management, data retrieval, and
data security within a database management system. The specific features and capabilities of a
DBMS may vary based on the system you choose, as there are various types of DBMS, such as
relational databases (e.g., MySQL, PostgreSQL, Oracle), NoSQL databases (e.g., MongoDB,
Cassandra), and more, each with its own unique features and components.
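Several of these components can be seen working together in a few lines of SQLite: DDL defines the structure, DML manipulates the data, transaction management makes a two-step change atomic, and DQL retrieves the result. The table and values are illustrative.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# DDL: define the structure of the database.
con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, "
            "balance REAL NOT NULL)")

# DML: manipulate the stored data.
con.execute("INSERT INTO account VALUES (1, 100.0)")
con.execute("INSERT INTO account VALUES (2, 50.0)")

# Transaction management: the transfer either fully happens or is
# rolled back; `with con:` commits on success, rolls back on error.
try:
    with con:
        con.execute("UPDATE account SET balance = balance - 30 "
                    "WHERE id = 1")
        con.execute("UPDATE account SET balance = balance + 30 "
                    "WHERE id = 2")
except sqlite3.Error:
    pass

# DQL: retrieve the data.
print(con.execute("SELECT id, balance FROM account "
                  "ORDER BY id").fetchall())
```

If either UPDATE failed, the rollback would leave both balances untouched, which is the consistency guarantee the transaction manager provides.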


KEY TAKEAWAYS:-

Data Definition Language (DDL): DDL is a subset of SQL (Structured Query Language) used to
define the structure of a database. It includes commands like CREATE, ALTER, DROP, and
TRUNCATE to define tables, indexes, and constraints.

Data Manipulation Language (DML): DML is used for manipulating data stored in the database.
Common DML commands include SELECT, INSERT, UPDATE, and DELETE.

Data Query Language (DQL): DQL is used to retrieve data from the database. The most
common DQL command is SELECT.

Data Control Language (DCL): DCL is responsible for controlling access to data within the
database. It includes commands like GRANT and REVOKE, which grant or revoke privileges and
permissions to users.



SUB LESSON 3.1

PARALLEL DATABASE

PARALLEL DATABASE :-

➢ A parallel database is one that involves multiple processors working in parallel on the
database to provide its services.
➢ A parallel database system seeks to improve performance through the parallelization of
operations such as loading data, building indexes, and evaluating queries. Parallel systems
improve processing and I/O speeds by using multiple CPUs and disks in parallel.
➢ Nowadays organizations need to handle huge amounts of data at high transfer rates. For
such requirements, a client-server or centralized system is not efficient. The concept of
the parallel database arose from the need to improve the efficiency of such systems through
parallelization.
NEED :-
Multiple resources like CPUs and Disks are used in parallel. The operations are performed
simultaneously, as opposed to serial processing. A parallel server can allow access to a single
database by users on multiple machines. It also performs many parallelization operations like
data loading, query processing, building indexes, and evaluating queries.

WORKING OF PARALLEL DATABASE :-

Let us discuss how a parallel database works, step by step −


➢ Step 1 − Parallel processing divides a large task into many smaller tasks and executes the
smaller tasks concurrently on several CPUs, completing the work more quickly.
➢ Step 2 − The driving force behind parallel database systems is the demand of applications that
have to query extremely large databases of the order of terabytes or that have to process a
large number of transactions per second.


➢ Step 3 − In parallel processing, many operations are performed simultaneously as opposed to
serial processing, in which the computational steps are performed sequentially.
This working of a parallel database is illustrated in a diagram (not reproduced here).

PERFORMANCE MEASURES :-

There are two main measures of the performance of a database system, explained below:
➢ Throughput − the number of tasks that can be completed in a given time interval. A
system that processes a large number of small transactions can improve throughput by
processing many transactions in parallel.
➢ Response time − the amount of time it takes to complete a single task from the time it is
submitted. A system that processes large transactions can improve response time, as well as
throughput, by performing the subtasks of each transaction in parallel.
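The two measures pull in different directions depending on how workers are used, which a small numeric sketch makes concrete. Suppose each task takes 2 seconds of work and there are 4 workers; the figures are illustrative and ignore coordination overhead.

```python
task_seconds = 2.0
workers = 4

# Throughput: each worker runs its own whole task, so four tasks
# finish every two seconds.
throughput = workers / task_seconds      # tasks completed per second

# Response time (ideal case): one task is split into four parallel
# subtasks, so a single task finishes sooner.
response_time = task_seconds / workers   # seconds for one task

print(throughput, response_time)
```

In practice the speedup for a single task is less than this ideal because of the cost of splitting work and merging results.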

BENEFITS OF PARALLEL DATABASE :-

The benefits of the parallel database are explained below −

SPEED :-

Speed is the main advantage of parallel databases. The server breaks a user's database
request into parts and sends each part to a separate computer. The computers work on their
parts in parallel, and the outputs are combined and returned to the user. This speeds up
most requests for data, so large databases can be accessed more quickly.

CAPACITY :-

As more users request access to the database, network administrators can add more machines
to the parallel server, increasing its overall capacity.
For example, a parallel database enables a large online store to serve thousands of users at
the same time. With a single server, this level of performance is not feasible.


RELIABILITY :-

A properly configured parallel database will continue to work despite the failure of any
single computer in the cluster. The database server senses that one computer is not
responding and redirects its work to the other computers.
Many companies, such as online retailers, want their database to be accessible as close to
always as possible, and this is where a parallel database stands out.
This method also lets technicians conduct scheduled maintenance computer by computer: a
server command removes the affected machine from the cluster, and the required maintenance
and updates are then performed.

BENEFITS FOR QUERIES :-

Parallel query processing can benefit the following types of queries −


➢ Select statements that scan large numbers of pages but output a few rows only.
➢ Select statements that include union, order by, or distinct, since these queries can
populate worktables in parallel, and can make use of parallel sorting.
➢ Select statements that use merge joins can use parallel processing for scanning tables
and also for sorting and merging.
➢ Select statements where the reformatting strategy is chosen by the optimizer, since
these can populate worktables in parallel, and can make use of parallel sorting.
➢ Create index statements, and the alter table - add constraint clauses that create
indexes, unique and primary keys.

KEY TAKEAWAYS

• A parallel database is one that involves multiple processors working in parallel on the
database to provide its services.
• Advantages of a parallel database are speed, capacity, and reliability.

SUB LESSON 3.2

DISTRIBUTED DATABASE

➢ A distributed database is one that consists of two or more files spread across various places,
whether they are connected by the same network or not. The database is split up into
different physical places for storage and processing, and there are numerous database
nodes involved.
➢ Data is conceptually integrated by a centralised distributed database management system
(DDBMS) so that it may be managed as if it were all kept in one place. Periodically, the
DDBMS synchronises all the data, ensuring that deletions and modifications made at one
location will automatically appear in the data stored elsewhere.
➢ A centralised database, in contrast, consists of a single database file that is located at a
single location across a single network.

THE CHARACTERISTICS OF DISTRIBUTED DATABASES :-

Distributed databases are logically connected to one another when they are part of a collection,
and they frequently form a single logical database. Data is physically stored across various sites
and is separately handled in distributed databases. Each site's processors are connected to one
another via a network, but they are not set up for multiprocessing.
➢ A widespread misunderstanding is that a distributed database is equivalent to a loosely
coupled file system; in reality it is considerably more complicated than that. Distributed
databases use transaction processing, but they are not the same as transaction processing
systems.
➢ Generally speaking, distributed databases have the following characteristics:
➢ Location independence
➢ Distributed query processing
➢ Distributed transaction management
➢ Hardware independence
➢ Operating system independence
➢ Network independence
➢ Transaction transparency

DISTRIBUTED DATABASE ARCHITECTURE :-

➢ Distributed databases can be homogeneous or heterogeneous.

➢ All of the physical sites in a homogeneous distributed database system use the same
operating system and database software, as well as the same underlying hardware. Because they
appear to the user as a single system, homogeneous distributed database systems can be
significantly simpler to build and administer. For a distributed database system to be
considered homogeneous, the data structures at each location must be the same or compatible,
and the database software used at each location must be identical or compatible.
➢ In a heterogeneous distributed database, the hardware, operating systems, or database
applications at each location may vary. Separate sites may employ different technologies and
schemas, and a difference in schema can make query and transaction processing challenging.
➢ Nodes may differ in technology, software, and data structures, or may be in incompatible
locations. Users may be able to access data stored at a different site but not upload or
modify it. Because heterogeneous distributed databases are frequently challenging to use,
many organisations find them economically unviable.
ADVANTAGES OF DISTRIBUTED DATABASES :-

Using distributed databases has many benefits:
➢ Because distributed databases support modular development, systems can be enlarged by
placing new computers and local data at a new location and seamlessly connecting them to the
distributed system.
➢ In centralised databases, failures result in a total shutdown of the system.
➢ When a component fails in distributed database systems, however, the system will continue to
function at reduced performance until the error is fixed.
➢ Admins can achieve lower communication costs for distributed database systems if the data is
located close to where it is used the most. This is not possible in centralized systems.

TYPES OF DISTRIBUTED DATABASES:-

➢ Replicated data is used to create instances of data in different parts of the database. By using
replicated data, distributed databases can access identical data locally, thus avoiding traffic.
Replicated data can be divided into two categories: read-only and writable data.
➢ Read-only versions of replicated data allow revisions only to the first instance; the
other enterprise data replicas are then adjusted to match. Writable data can be altered
anywhere, and the first instance is changed immediately.

➢ Horizontally fragmented data involves the use of primary keys that refer to one record in the
database. Horizontal fragmentation is usually reserved for situations in which business
locations only need to access the database pertaining to their specific branch.
➢ Vertically fragmented data involves using copies of primary keys that are available within each
section of the database and are accessible to each branch. Vertically fragmented data is utilized
when the branch of a business and the central location interact with the same accounts in
different ways.
➢ Reorganized data is data that has been adjusted or altered for decision support databases.
Reorganized data is typically used when two different systems are handling transactions and
decision support. Decision support systems can be difficult to maintain and online transaction
processing requires reconfiguration when many requests are being made.

➢ Separate schema data partitions the database and the software used to access it in order to fit
different departments and situations. There is usually an overlap between different databases
within separate schema data.
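The difference between horizontal and vertical fragmentation can be shown on a tiny relation. In this sketch the rows, branch names, and column names are all illustrative; the point is which slices of the table each site keeps.

```python
rows = [
    {"acct_id": 1, "branch": "Pune",   "owner": "Asha",  "balance": 500},
    {"acct_id": 2, "branch": "Mumbai", "owner": "Ravi",  "balance": 300},
    {"acct_id": 3, "branch": "Pune",   "owner": "Meena", "balance": 700},
]

# Horizontal fragmentation: a site keeps only the ROWS for its own
# branch; each fragment holds complete records keyed by acct_id.
pune_site = [r for r in rows if r["branch"] == "Pune"]

# Vertical fragmentation: a site keeps only some COLUMNS, with the
# primary key copied into every fragment so rows can be rejoined.
ledger_site = [{"acct_id": r["acct_id"], "balance": r["balance"]}
               for r in rows]

print(len(pune_site), ledger_site[0])
```

Rejoining the vertical fragments is a join on the shared primary key, which is why every vertical fragment must carry a copy of it.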

EXAMPLES OF DISTRIBUTED DATABASES :-

➢ Though there are many distributed databases to choose from, some examples of distributed
databases include Apache Ignite, Apache Cassandra, Apache HBase, Couchbase Server, Amazon
SimpleDB, Clusterpoint, and FoundationDB.
➢ Amazon SimpleDB is used as a web service with Amazon Elastic Compute Cloud and Amazon S3.
Amazon SimpleDB enables developers to request and store data with minimal database
management and administrative responsibility.
➢ Clusterpoint removes the complexity, scalability issues, and performance limitations
of relational database architectures. Data is managed in XML or JSON format using open APIs.
Because Clusterpoint is a schema-free document database, it removes the scalability problems
and performance issues that most relational database architectures face.
➢ FoundationDB is a multimodel database designed around a core database that exposes an
ordered key valued store with each transaction. These transactions support ACID properties
and are capable of reading and writing keys that are stored on any machine within the cluster.
Additional features appear in layers around this core.

KEY TAKEAWAYS
➢ A distributed database is one that consists of two or more files spread across various places,
whether they are connected by the same network or not. The database is split up into different
physical places for storage and processing, and there are numerous database nodes involved.
➢ In centralised databases, failures result in a total shutdown of the system.
➢ When a component fails in distributed database systems, however, the system will continue to
function at reduced performance until the error is fixed.

Admins can achieve lower communication costs for distributed database systems if the data is
located close to where it is used the most. This is not possible in centralized systems.

SUB LESSON 3.3

OBJECT ORIENTED DATABASE

An Object-Oriented Database (OODB) is a type of database management system (DBMS) that is
designed to store and manage data in an object-oriented manner. In an OODB, data is
organized and stored as objects, much like in object-oriented programming, where objects are
instances of classes that encapsulate data and behavior. This approach allows for more
natural and efficient handling of complex and interconnected data structures.
Here are some key features and concepts associated with Object-Oriented Databases:
1. Objects: In OODBs, data is represented as objects. An object is a self-contained unit that
combines both data (attributes) and methods (functions) to manipulate that data. Objects in
OODBs can be hierarchically organized, and relationships between objects can be expressed
directly.
2. Classes: Objects are often grouped into classes, which define the structure and behavior of
objects. These classes serve as templates for creating new objects. Inheritance is a fundamental
concept, allowing one class to inherit attributes and methods from another, fostering code
reuse and maintaining a hierarchical structure.
3. Complex Data Types: OODBs support complex data types like arrays, lists, and sets, making it
easier to represent complex data structures in the database.
4. Encapsulation: Encapsulation is a key concept in OODBs. It allows you to bundle data and
behavior into objects, ensuring that data is protected and can only be accessed through defined
methods, promoting data integrity.
5. Object Identity: Objects in an OODB have a unique identity, which can be used to distinguish
them from other objects. This identity is typically provided by a system-generated object
identifier.
6. Query Language: OODBs often come with object query languages, which allow you to retrieve
data using object-oriented constructs. These query languages are well-suited for navigating and
querying complex data structures.

7. Inheritance: Inheritance is a core concept in OODBs. Objects can inherit attributes and methods
from other objects, allowing for code reuse and building hierarchies of related objects.
8. Persistence: OODBs are designed to provide data persistence, which means that the data
stored in the database is durable and can be retrieved even after the application that created it
has terminated.
9. Concurrency Control and Transaction Management: OODBs include features for managing
concurrent access to data and for ensuring the consistency of data through transactions.
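The object-oriented concepts an OODB stores directly, such as classes, encapsulation, and inheritance, can be illustrated with plain Python classes. The class and attribute names below are illustrative; a real OODBMS would persist objects like these rather than flattening them into tables.

```python
class Person:
    def __init__(self, name):
        self._name = name          # encapsulated attribute

    @property
    def name(self):                # access only through a method
        return self._name


class Student(Person):             # inheritance: a Student is a Person
    def __init__(self, name, roll_no):
        super().__init__(name)
        self.roll_no = roll_no

    def describe(self):            # behavior bundled with the data
        return f"{self.name} (roll {self.roll_no})"


s = Student("Naman", 101)
print(s.describe())
print(isinstance(s, Person))       # inherited type relationship
```

In an OODBMS, the object `s` would be saved with its class, identity, and inherited attributes intact, so retrieving it later yields a usable `Student` object rather than raw rows to reassemble.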
Examples of Object-Oriented Database Management Systems (OODBMS) include GemStone/S,
db4o, ObjectDB, and Versant, among others. These systems are particularly useful in scenarios
where the data model closely matches an object-oriented representation, such as in
applications with complex data structures, where maintaining the relationships between
objects is crucial.
It's worth noting that while OODBs offer advantages in certain scenarios, they are not as widely
used as relational databases (RDBMS) or NoSQL databases in most applications, and the choice
of database technology depends on the specific requirements of the project.
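The OODB concepts discussed above (classes, inheritance, encapsulation, object identity) can be illustrated with a short sketch in an object-oriented language. This example is not tied to any particular OODBMS; the class and attribute names are invented purely for illustration.

```python
class Vehicle:
    """Base class: defines the structure (attributes) and behaviour (methods)."""

    def __init__(self, reg_no):
        self._reg_no = reg_no          # encapsulated: accessed only via a method

    def registration(self):            # the defined access method
        return self._reg_no


class Car(Vehicle):                    # inheritance: Car reuses Vehicle's structure
    def __init__(self, reg_no, seats):
        super().__init__(reg_no)
        self.seats = seats


c = Car("GJ-06-1234", 5)
print(c.registration())   # behaviour inherited from Vehicle
print(id(c))              # every object has a unique identity, much like an OID
```

In an OODBMS, objects like `c` would be stored directly, keeping their class, identity, and relationships intact instead of being flattened into rows.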

KEY TAKEAWAYS

Examples of Object-Oriented Database Management Systems (OODBMS) include GemStone/S,
db4o, ObjectDB, and Versant, among others.

Features supported by OODBs are: objects, classes, inheritance, complex data types,
encapsulation, object identity, a query language, persistence, and concurrency control with
transaction management.



SUB LESSON 3.4

OBJECT RELATIONAL DATABASE

OBJECT RELATIONAL DATABASE :-

➢ An Object relational model is a combination of an Object-oriented database model and a
Relational database model. So, it supports objects, classes, inheritance, etc., just like Object-
oriented models, and it also supports the data types, tabular structures, and query language of
the Relational data model.
➢ An object-relational database may also be known as object relational database management
systems (ORDBMS).
➢ ORD is said to be the middleman between relational and object-oriented databases because it
contains aspects and characteristics from both models. In ORD, the basic approach is based on
RDB, since the data is stored in a traditional database and manipulated and accessed using
queries written in a query language like SQL. However, ORD also showcases an object-oriented
characteristic in that the database is considered an object store, usually for software that is
written in an object-oriented programming language. Here, APIs are used to store and access
the data as objects.
➢ One of ORD’s aims is to bridge the gap between conceptual data modelling techniques for
relational and object-oriented databases like the entity-relationship diagram (ERD) and object-
relational mapping (ORM). It also aims to connect the divide between relational databases and
the object-oriented modeling techniques that are usually used in programming languages like
Java, C# and C++.
➢ Traditional RDBMS products concentrate on the efficient organization of data that is derived
from a limited set of data-types. On the other hand, an ORDBMS has a feature that allows
developers to build and innovate their own data types and methods, which can be applied to
the DBMS. With this, ORDBMS intends to allow developers to increase the abstraction with
which they view the problem area.
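The idea described above, that an ORD lets application code work with objects while storage stays relational, can be sketched with a minimal, hypothetical mapping layer. This uses Python's built-in sqlite3 module; the `Employee` class, table, and functions are invented for illustration and do not represent any particular ORDBMS product.

```python
import sqlite3


class Employee:
    """Application-side object; the database stores it as a relational row."""

    def __init__(self, emp_id, name):
        self.emp_id, self.name = emp_id, name


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT)")


def save(emp):
    # object -> row: this small API is the "mapping" layer the text describes
    conn.execute("INSERT INTO employee VALUES (?, ?)", (emp.emp_id, emp.name))


def load(emp_id):
    # row -> object: queries still run through SQL underneath
    row = conn.execute(
        "SELECT emp_id, name FROM employee WHERE emp_id = ?", (emp_id,)
    ).fetchone()
    return Employee(*row) if row else None


save(Employee(1, "Kristen"))
print(load(1).name)   # -> Kristen
```

A real ORDBMS (e.g. PostgreSQL with user-defined types) pushes this idea further by letting developers define new data types inside the database itself.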



HISTORY OF OBJECT RELATIONAL DATA MODEL :-
Both Relational data models and Object-oriented data models are very useful, but each was felt
to be lacking in some characteristics, so work began on a model that combined them both.
Hence, the Object relational data model was created as a result of research carried out in the
1990s.

ADVANTAGES OF OBJECT RELATIONAL MODEL :-


The advantages of the Object Relational model are :-
➢ Inheritance
The Object Relational data model allows its users to inherit objects, tables etc. so that they can
extend their functionality. Inherited objects contains new attributes as well as the attributes
that were inherited.
➢ Complex Data Types
Complex data types can be formed using existing data types. This is useful in Object relational
data model as complex data types allow better manipulation of the data.
➢ Extensibility
The functionality of the system can be extended in Object relational data model. This can be
achieved using complex data types as well as advanced concepts of object oriented model such
as inheritance.

DISADVANTAGES OF OBJECT RELATIONAL MODEL :-

The object relational data model can get quite complicated and difficult to handle at times as it
is a combination of the Object oriented data model and Relational data model and utilizes the
functionalities of both of them.

KEY TAKEAWAYS
An Object relational model is a combination of a Object oriented database model and a
Relational database model.
Advantages of Object Relational Model can be Inheritance, Complex Data Types, Extensibility.



RELATIONAL STRUCTURE

SUB LESSON 4.1

TABLES

Tables are essential objects in a database because they hold all the information or data. For
example, a database for a business can have a Contacts table that stores the names of their
suppliers, e-mail addresses, and telephone numbers. Because other database objects depend
so heavily on tables, you should always start your design of a database by creating all of its
tables and then creating any other objects. Before you create tables, consider your
requirements and determine all the tables that you might need. For an introduction to planning
and designing a database, see Database design basics.
A relational database like Access usually has several related tables. In a well-designed database,
each table stores data about a particular subject, such as employees or products. A table has
records (rows) and fields (columns). Fields have different types of data, such as text, numbers,
dates, and hyperlinks.
• SQL Table is a collection of data which is organized in terms of rows and columns. In
DBMS, the table is known as a relation and row as a tuple.
• Table is a simple form of data storage. A table is also considered a convenient
representation of relations.
Let's see an example of the EMPLOYEE table:

EMP_ID EMP_NAME CITY PHONE_NO

1 Kristen Washington 7289201223

2 Anna Franklin 9378282882

3 Jackson Bristol 9264783838

4 Kellan California 7254728346

5 Ashley Hawaii 9638482678

In the above table, "EMPLOYEE" is the table name, "EMP_ID", "EMP_NAME", "CITY",
"PHONE_NO" are the column names. The combination of data of multiple columns forms a row,
e.g., 1, "Kristen", "Washington" and 7289201223 are the data of one row.

OPERATION ON TABLE :-

1. Create table
2. Drop table
3. Delete table
4. Rename table

SQL CREATE TABLE :-

SQL CREATE TABLE is used to create a table in the database. To define the table, you should
specify the name of the table and also define its columns and each column's data type.
-: SYNTAX :-
create table "table_name"
("column1" "data type",
"column2" "data type",
"column3" "data type",
...
"columnN" "data type");

-: EXAMPLE :-
SQL> CREATE TABLE EMPLOYEE (
EMP_ID INT NOT NULL,
EMP_NAME VARCHAR (25) NOT NULL,
PHONE_NO INT NOT NULL,
ADDRESS CHAR (30),
PRIMARY KEY (EMP_ID)
);
If the table is created successfully, you can verify it from the message returned by the SQL
server; otherwise, you can use the DESC command as follows:

SQL> DESC EMPLOYEE;

Field      Type          Null  Key  Default  Extra
EMP_ID     int(11)       NO    PRI  NULL
EMP_NAME   varchar(25)   NO         NULL
PHONE_NO   int(11)       NO         NULL
ADDRESS    char(30)      YES        NULL

4 rows in set (0.35 sec)


• Now you have an EMPLOYEE table in the database, and you can use the stored
information related to the employees.
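The CREATE TABLE and DESC steps above can be reproduced as a runnable sketch with Python's built-in sqlite3 module. Note this is illustrative: SQLite's equivalent of DESC is `PRAGMA table_info`, and its type handling differs slightly from the MySQL-style session shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EMP_ID   INTEGER     NOT NULL,
        EMP_NAME VARCHAR(25) NOT NULL,
        PHONE_NO INTEGER     NOT NULL,
        ADDRESS  CHAR(30),
        PRIMARY KEY (EMP_ID)
    )
""")

# SQLite's counterpart of DESC: PRAGMA table_info lists each column's
# position, name, type, NOT NULL flag, default value, and primary-key flag.
for col in conn.execute("PRAGMA table_info(EMPLOYEE)"):
    print(col)
```

Running this prints one tuple per column, confirming the table structure just as DESC does.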

DROP TABLE :-

A SQL DROP TABLE statement is used to delete a table definition and all the data from a table.
When this command is executed, all the information available in the table is lost forever, so you
have to be very careful while using this command.
-:SYNTAX:-
DROP TABLE "table_name";
Firstly, you need to verify the EMPLOYEE table using the following command:
SQL> DESC EMPLOYEE;

Field      Type          Null  Key  Default  Extra
EMP_ID     int(11)       NO    PRI  NULL
EMP_NAME   varchar(25)   NO         NULL
PHONE_NO   int(11)       NO         NULL
ADDRESS    char(30)      YES        NULL

4 rows in set (0.35 sec)


This table shows that the EMPLOYEE table is available in the database, so we can drop it as follows:
SQL> DROP TABLE EMPLOYEE;
Query OK, 0 rows affected (0.01 sec)
Now, if we check whether the table still exists by running DESC EMPLOYEE again, the server
reports that no such table exists, confirming that the table has been dropped.
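The same create-drop-verify workflow can be sketched with Python's built-in sqlite3 module, which is illustrative of the behaviour rather than the exact server session above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY)")
conn.execute("DROP TABLE EMPLOYEE")

# Any further access now fails, confirming the definition and data are gone.
try:
    conn.execute("SELECT * FROM EMPLOYEE")
except sqlite3.OperationalError as e:
    print(e)   # -> no such table: EMPLOYEE
```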

SQL DELETE TABLE :-

In SQL, DELETE statement is used to delete rows from a table. We can use WHERE condition to
delete a specific row from a table. If you want to delete all the records from the table, then you
don't need to use the WHERE clause.
-:SYNTAX:-
DELETE FROM table_name WHERE condition;

-:EXAMPLE:-
Suppose, the EMPLOYEE table having the following records

EMP_ID EMP_NAME CITY PHONE_NO SALARY

1 Kristen Chicago 9737287378 150000

2 Russell Austin 9262738271 200000

3 Denzel Boston 7353662627 100000

4 Angelina Denver 9232673822 600000

5 Robert Washington 9367238263 350000

6 Christian Los Angeles 7253847382 260000

The following query will DELETE the employee whose ID is 3.

SQL> DELETE FROM EMPLOYEE

WHERE EMP_ID = 3;
Now, the EMPLOYEE table would have the following records.

EMP_ID EMP_NAME CITY PHONE_NO SALARY

1 Kristen Chicago 9737287378 150000

2 Russell Austin 9262738271 200000

4 Angelina Denver 9232673822 600000

5 Robert Washington 9367238263 350000

6 Christian Los Angeles 7253847382 260000


If you don't specify the WHERE condition, it will remove all the rows from the table.
DELETE FROM EMPLOYEE;
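The DELETE behaviour described above, removing one row with WHERE and all rows without it, can be sketched with Python's built-in sqlite3 module (sample data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(1, "Kristen"), (2, "Russell"), (3, "Denzel")])

conn.execute("DELETE FROM EMPLOYEE WHERE EMP_ID = 3")   # deletes one specific row
print(conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0])   # -> 2

conn.execute("DELETE FROM EMPLOYEE")                    # no WHERE: removes all rows
print(conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0])   # -> 0
```

Note that after both DELETE statements the table itself still exists (unlike DROP TABLE); only its rows are gone.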

KEY TAKEAWAYS

Tables are essential objects in a database because they hold all the information or data.
Operation on Table:
1. Create table
2. Drop table
3. Delete table
4. Rename table
Create Table Syntax:
create table "table_name"
("column1" "data type",
"column2" "data type",
"column3" "data type",
...
"columnN" "data type");

To See structure of a Table created:
DESC <Table Name>;

To Drop a Table:
Drop table <Table Name>;

To Delete rows from a Table:
DELETE FROM <Table Name> [WHERE condition];


ENTITY SETS
SUB LESSON 4.2

ENTITY IN DBMS

➢ Database Management System (DBMS) is an essential tool to manage data, but do you know
how important entities are in DBMS?
The role of the entity is the representation and management of data. In this article, we are
going to discuss entities in DBMS.

ENTITY:-

➢ An entity is referred to as an object or thing that exists in the real world. For example,
customer, car, pen, etc.
➢ Entities are stored in the database, and they should be distinguishable, i.e., easily identifiable
within a group. For example, in a group of identical pens from the same company, no single pen
can be identified, so they are only objects; but pens of different colours become unique, and
each can be called an entity, like a red pen, green pen, blue pen, or black pen.
➢ In a group of pens of different colours, we can easily identify any pen because of its colour, so
each differently coloured pen is an entity.
➢ For extracting data from the database, each data must be unique in its own way so that it
becomes easier to differentiate between them. Distinct and unique data is known as an entity.
➢ An entity has some attributes which depict the entity's characteristics. For example, an entity
"Student" has attributes such as "Student_roll_no", "Student_name", "Student_subject", and
"Student_marks".

EXAMPLE OF ENTITY IN DBMS IN TABULAR FORM :-

Student_rollno Student_name Student_subject Student_marks

1 Robert English 85

2 Parker Mathematics 75

3 Harry Science 80

4 George Geography 70

Some entities are related to other entities in the table. For example, the "Student" entity is
related to the "University" entity. The ERD (Entity Relationship Diagram) model comes to light
to visually show the relationship between several entities.

KINDS OF ENTITY:-

There are two kinds of entities, which are as follows :-

➢ Tangible Entity:
o It is an entity in DBMS, which is a physical object that we can touch or see. In simple words, an
entity that has a physical existence in the real world is called a tangible entity.
o For example, in a database, a table represents a tangible entity because it contains a physical
object that we can see and touch in the real world. It includes colleges, bank lockers, mobiles,
cars, watches, pens, paintings, etc.
➢ Intangible Entity:
o It is an entity in DBMS, which is a non-physical object that we cannot see or touch. In simple
words, an entity that does not have any physical existence in the real world is known as an
intangible entity.
o For example, a bank account logically exists, but we cannot see or touch it.

ENTITY TYPE :-

➢ A collection of entities with general characteristics is known as an entity type.


➢ For example, a database of a corporate company has entity types such as employees,
departments, etc. In DBMS, every entity type contains a set of attributes that explain the entity.
➢ The Employee entity type can have attributes such as name, age, address, phone number, and
salary.
➢ The Department entity type can have attributes such as name, number, and location in the
department.

KINDS OF ENTITY TYPE :-

There are two kinds of entity type, which are as follows:


1. Strong Entity Type: It is an entity that has its own existence and is independent.
The entity relationship diagram represents a strong entity type with the help of a single
rectangle. Below is the ERD of the strong entity type:

In the above example, the "Customer" is the entity type with attributes such as ID, Name,
Gender, and Phone Number. Customer is a strong entity type as it has a unique ID for each
customer.

2. Weak Entity Type: It is an entity that does not have its own existence and relies on a strong
entity for its existence. The Entity Relationship Diagram represents the weak entity type using
double rectangles. Below is the ERD of the weak entity type:

In the above example, "Address" is a weak entity type with attributes such as House No., City,
Location, and State.
The relationship between a strong and a weak entity type is known as an identifying
relationship.
Using a double diamond, the Entity-Relationship Diagram represents a relationship between
the strong and the weak entity type.
Let us see an example of the relationship between the Strong entity type and weak entity type
with the help of ER Diagram:
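In table form, the identifying relationship between the strong entity (Customer) and the weak entity (Address) above can be sketched with Python's built-in sqlite3 module. The weak entity has no key of its own, so its primary key combines the owner's key with a partial key; table and column names here are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CUSTOMER (          -- strong entity: has its own primary key
        ID   INTEGER PRIMARY KEY,
        NAME TEXT
    );
    CREATE TABLE ADDRESS (           -- weak entity: depends on CUSTOMER
        CUST_ID  INTEGER REFERENCES CUSTOMER(ID),
        HOUSE_NO TEXT,
        CITY     TEXT,
        PRIMARY KEY (CUST_ID, HOUSE_NO)   -- owner's key + partial key
    );
""")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Robert')")
conn.execute("INSERT INTO ADDRESS VALUES (1, '12-B', 'Vadodara')")
print(conn.execute("SELECT CITY FROM ADDRESS WHERE CUST_ID = 1").fetchone()[0])
```

An ADDRESS row only makes sense together with the CUSTOMER it belongs to, which is exactly the dependency the double-rectangle notation expresses.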

ENTITY SET :-

An entity set is a group of entities of the same entity type.


For example, an entity set of students, an entity set of motorbikes, an entity of smartphones,
an entity of customers, etc.
Entity sets can be classified into two types :-

1. Strong Entity Set:


In a DBMS, a strong entity set consists of a primary key.
For example, an entity of motorbikes with the attributes, motorbike's registration number,
motorbike's name, motorbike's model, and motorbike's colour.
Below is the representation of a strong entity set in tabular form:

Example of Entity Relationship Diagram representation of the above strong entity set :-

2. Weak Entity Set:
In a DBMS, a weak entity set does not contain a primary key.
For example, An entity of smartphones with its attributes, phone's name, phone's colour, and
phone's RAM.
Below is the representation of a weak entity set in tabular form:

CONCLUSION

In this article, you read all the vital things related to entities in DBMS.
• We have discussed that entity is anything that exists in the real world and is identifiable.
• We have discussed the types of entities, which are tangible entities and intangible
entities.
• We have discussed entity type and types of entity type, which are weak entity type and
strong entity type.
• We have discussed entity sets and types of entity sets, which are weak entity sets and
strong entity sets.

KEY TAKEAWAYS
An entity is referred to as an object or thing that exists in the real world. For example,
customer, car, pen, etc.
There are 2 types of entities Tangible and Intangible entities.
There are 2 types of entities set Strong entity set and Weak entity set.

ATTRIBUTES IN DBMS

SUB LESSON 4.3

ATTRIBUTES IN DBMS

DBMS :-

➢ DBMS stands for Database Management System, which is a tool or software used for the
creation, deletion, or manipulation of the database.
➢ Attributes :-
➢ In DBMS, we have entities, and each entity has some properties describing its behaviour;
these properties are called attributes. In relational databases, we have tables, and each column
of a table corresponds to an attribute of an entity, so all the entries in that column must strictly
follow that attribute's definition. Attributes define the characteristic properties of an entity.

FOLLOWING ARE THE ATTRIBUTES OF AN ENTITY

• Simple Attribute :-
It is also known as atomic attributes. When an attribute cannot be divided further, then it is
called a simple attribute.
For example, in a student table, the branch attribute cannot be further divided. It is called a
simple or atomic attribute because it contains only a single value that cannot be broken further.

• Composite Attribute :-
Composite attributes are those that are made up of the composition of more than one
attribute. When any attribute can be divided further into more sub-attributes, then that
attribute is called a composite attribute.

For example, in a student table, we have attributes of student names that can be further
broken down into first name, middle name, and last name. So the student name will be a
composite attribute.
Another example from a personal detail table would be the attribute of address. The address
can be divided into a street, area, district, and state.

• Single-valued Attribute :-
Those attributes which can have exactly one value are known as single valued attributes. They
contain singular values, so more than one value is not allowed.
For example, the DOB of a student can be a single valued attribute. Another example is gender
because one person can have only one gender.

• Multi-valued Attribute :-
Those attributes which can have more than one entry or which contain more than one value
are called multi valued attributes.
In the Entity Relationship (ER) diagram, we represent the multi valued attribute by double oval
representation.
For example, one person can have more than one phone number, so that it would be a multi
valued attribute. Another example is the hobbies of a person because one can have more than
one hobby.

• Derived Attribute :-
A derived attribute is one whose value can be computed from another attribute, so it does not
need to be stored itself; the attributes it is computed from are called stored attributes. We can
perform calculations on stored attributes to create derived attributes.
For example, the age of a student can be a derived attribute because we can get it by the DOB
of the student.
Another example can be of working experience, which can be obtained by the date of joining of
an employee.

In the ER diagram, we represent the derived attributes by a dotted oval shape.
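The derived-attribute idea, age computed from a stored DOB rather than stored itself, can be sketched with Python's built-in sqlite3 module (dates and names invented; the reference date is fixed so the example is deterministic):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (ROLL_NO INTEGER PRIMARY KEY, DOB TEXT)")
conn.execute("INSERT INTO STUDENT VALUES (1, '2000-06-15')")

# Age is derived at query time from the stored DOB; it is never stored.
row = conn.execute("""
    SELECT CAST((julianday('2024-06-15') - julianday(DOB)) / 365.25 AS INT)
    FROM STUDENT WHERE ROLL_NO = 1
""").fetchone()
print(row[0])   # -> 24
```

Because the value is recomputed from the stored attribute each time, it can never fall out of date the way a stored copy of the age would.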

• Complex Attribute :-
If any attribute has the combining property of multi values and composite attributes, then it is
called a complex attribute. It means if one attribute is made up of more than one attribute and
each attribute can have more than one value, then it is called a complex attribute.
For example, if a person has more than one office and each office has an address made from a
street number and city. So the address is a composite attribute, and offices are multi valued
attributes, So combing them is called complex attributes.

• Key Attribute :-
Those attributes which can uniquely identify a tuple in the relational table are called key
attributes.
For example, the roll number of a student is a key attribute because it is unique for each student.
We can understand the attributes by the following example:

In the above example, we have an ER diagram of a table named Employee. We have a lot of
attributes from the above table.
• Department is a single valued attribute that can have only one value.
• Name is a composite attribute because it is made up of the first name, middle name, and
last name attributes.
• Work Experience attribute is a derived attribute, and it is represented by a dotted oval.
We can get the work experience by the other attribute date of joining.
• Phone number is a multi-valued attribute because one employee can have more than
one phone number, which is represented by a double oval representation.
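A multi-valued attribute like the phone number above cannot fit into a single column, so in a relational design it is usually moved into its own table with one row per value. This is a sketch using Python's built-in sqlite3 module; the table and sample numbers are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE EMP_PHONE (              -- one row per phone number
        EMP_ID   INTEGER REFERENCES EMPLOYEE(EMP_ID),
        PHONE_NO TEXT,
        PRIMARY KEY (EMP_ID, PHONE_NO)
    );
""")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Kristen')")
conn.executemany("INSERT INTO EMP_PHONE VALUES (?, ?)",
                 [(1, '7289201223'), (1, '9378282882')])   # two numbers, one employee
n = conn.execute("SELECT COUNT(*) FROM EMP_PHONE WHERE EMP_ID = 1").fetchone()[0]
print(n)   # -> 2
```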

KEY TAKEAWAYS:-
Attributes define the characteristic properties of an entity.
Simple Attribute :-
It is also known as atomic attributes. When an attribute cannot be divided further, then it is
called a simple attribute.
Multi-valued Attribute :-
Those attributes which can have more than one entry or which contain more than one value
are called multi valued attributes.
Derived Attribute :-
A derived attribute is one whose value can be computed from another (stored) attribute, so it
does not need to be stored itself. We can perform calculations on stored attributes to create
derived attributes.
Complex Attribute :-
If any attribute has the combining property of multi values and composite attributes, then it is
called a complex attribute. It means if one attribute is made up of more than one attribute and
each attribute can have more than one value, then it is called a complex attribute.
Key Attribute :-
Those attributes which can be identified uniquely in the relational table are called key attributes.

RELATIONAL STRUCTURE

RELATIONSHIPS AND TYPES OF RELATIONSHIPS


SUB LESSON 4.4

TYPES OF RELATIONSHIP IN DBMS


➢ A relational database collects different types of data sets using tables, records, and columns.
It creates well-defined relationships between database tables so that data can be stored and
retrieved easily. Examples of relational databases include Microsoft SQL Server, Oracle
Database, MySQL, etc.
There are some important parameters of the relational database :-
➢ It is based on a relational model (Data in tables).
➢ Each row in the table with a unique id, key.
➢ Columns of the table hold attributes of data.

Employee table (Relation / Table Name)

EmpID EmpName EmpAge CountryName

Emp 101 Andrew Mathew 24 USA

Emp 102 Marcus Dugles 27 England

Emp 103 Engidi Nathem 28 France

Emp 104 Jason Quilt 21 Japan

Emp 108 Robert 29 Italy


Following are the different types of relational database tables.
1. One to One relationship
2. One to many or many to one relationship
3. Many to many relationships
➢ One to One Relationship (1:1): It is used to create a relationship between two tables in which a
single row of the first table can be related to one and only one record of the second table.
Similarly, a row of the second table can be related to only one row of the first table.
➢ Following is the example to show a relational database, as shown below.



➢ One to Many Relationship (1:M): It is used to create a relationship between two tables. Any
single row of the first table can be related to one or more rows of the second table, but a
row of the second table can relate to only one row of the first table. Viewed from the second
table's side, the same relationship is known as a many to one relationship.
➢ Representation of One to Many relational databases



Representation of many to one relational database

➢ Many to Many Relationship (M:N): A many to many relationship creates a relationship
between two tables in which each record of the first table can relate to any number of records
(or no records) in the second table. Similarly, each record of the second table can also relate to
more than one record of the first table. It is also represented as an N:N relationship.
➢ For example, there are many people involved in each project, and every person can be
involved in more than one project.

DIFFERENCE BETWEEN A DATABASE AND A RELATIONAL DATABASE :-

Relational Database | Database
A relational database stores and arranges the data in tabular form, i.e., rows and columns. | It stores the data as files.
The data normalization feature is available in the relational database. | It does not have normalization.
It supports a distributed database. | It does not support a distributed database.
In a relational database, the values are stored as tables that require primary keys to access the data. | Generally, it stores the data in a hierarchical or navigational form.
It is designed to handle a huge collection of data and multiple users. | It is designed to handle a small collection of data files and a single user.
A relational database uses integrity constraint rules defined in the ACID properties. | It does not follow integrity constraint rules, nor does it utilize any security to protect the data from manipulation.

Advantages of relational databases :-


1. Simple Model: The simplest model of the relational database does not require any
complex structure or query to process the databases. It has a simple architectural process as
compared to a hierarchical database structure. Its simple architecture can be handled with
simple SQL queries to access and design the relational database.
2. Data Accuracy: Relational databases can have multiples tables related to each other
through primary and foreign keys. There are fewer chances for duplication of data fields.
Therefore the accuracy of data in relational database tables is greater than in any other
database system.
3. Easy to access Data: The data can be easily accessed from the relational database, and it
does not follow any pattern or way to access the data. One can access any data from a
database table using SQL queries. Each table in the associated database is joined through any
relational queries such as join and conditional descriptions to concatenate all tables to get the
required data.
4. Security: It restricts access so that only specific users can use the relational data in an RDBMS.
5. Collaborate: It allows multiple users to access the same database at a time.
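The many-to-many relationship described above is commonly implemented with a junction table whose composite primary key pairs the two foreign keys. This is a sketch using Python's built-in sqlite3 module; all table names and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PERSON  (PID  INTEGER PRIMARY KEY, NAME  TEXT);
    CREATE TABLE PROJECT (PRID INTEGER PRIMARY KEY, TITLE TEXT);
    CREATE TABLE WORKS_ON (                 -- junction table for the M:N link
        PID  INTEGER REFERENCES PERSON(PID),
        PRID INTEGER REFERENCES PROJECT(PRID),
        PRIMARY KEY (PID, PRID)
    );
""")
conn.execute("INSERT INTO PERSON VALUES (1, 'Anna')")
conn.execute("INSERT INTO PROJECT VALUES (10, 'Payroll')")
conn.execute("INSERT INTO PROJECT VALUES (20, 'Inventory')")
# One person on many projects (and each project may have many people).
conn.executemany("INSERT INTO WORKS_ON VALUES (?, ?)", [(1, 10), (1, 20)])

titles = [t for (t,) in conn.execute("""
    SELECT TITLE FROM PROJECT
    JOIN WORKS_ON ON PROJECT.PRID = WORKS_ON.PRID
    WHERE WORKS_ON.PID = 1 ORDER BY TITLE
""")]
print(titles)   # -> ['Inventory', 'Payroll']
```

The 1:1 and 1:M cases need no junction table; a single foreign key column on one side is enough.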



KEY TAKEAWAYS
The types of relationships possible between tables are 1:1, 1:M, M:1, and M:N.
These relationships make normalization possible in an RDBMS.
Data accuracy, normalization, easy access to data, security, and collaboration are some of the
features available through relationships in an RDBMS.



KEYS

TYPES OF KEYS
SUB LESSON 5.1
TYPES OF KEYS

➢ Keys play an important role in the relational database.


➢ It is used to uniquely identify any record or row of data from the table. It is also used to
establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique for each student. In
the PERSON table, passport_number, license_number, SSN are keys since they are unique for
each person.

TYPES OF KEYS :-

1. PRIMARY KEY :-

o It is the first key used to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys, as we saw in the PERSON table. The key which is most suitable
from those lists becomes a primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each
employee. In the EMPLOYEE table, we can even select License_Number and Passport_Number
as primary keys since they are also unique.
o For each entity, the primary key selection is based on requirements and developers.

2. CANDIDATE KEY :-

o A candidate key is an attribute or set of attributes that can uniquely identify a tuple.
o Except for the primary key, the remaining attributes are considered a candidate key.
The candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the
attributes, like SSN, Passport_Number, License_Number, etc., are considered a candidate key.

3. SUPER KEY :-

Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a
candidate key.
For example: In the above EMPLOYEE table, consider (EMPLOYEE_ID, EMPLOYEE_NAME). The
names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this
combination can also be a key.
Possible super keys are thus EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.

4. FOREIGN KEY :-

o Foreign keys are the column of the table used to point to the primary key of another
table.
o Every employee works in a specific department in a company, and employee and
department are two different entities. So we can't store the department's information in the
employee table. That's why we link these two tables through the primary key of one table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new
attribute in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are
related.
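The DEPARTMENT/EMPLOYEE link described above can be sketched with Python's built-in sqlite3 module. With foreign-key enforcement turned on (SQLite requires an explicit PRAGMA), a row referencing a non-existent department is rejected. Names and values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE DEPARTMENT (Department_Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE EMPLOYEE (
        Emp_Id        INTEGER PRIMARY KEY,
        Name          TEXT,
        Department_Id INTEGER REFERENCES DEPARTMENT(Department_Id)
    );
""")
conn.execute("INSERT INTO DEPARTMENT VALUES (1, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (100, 'Anna', 1)")     # valid link
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (101, 'Jay', 9)")  # no department 9
except sqlite3.IntegrityError as e:
    print(e)   # -> FOREIGN KEY constraint failed
```

This is exactly the linking role of a foreign key: EMPLOYEE rows can only point at departments that actually exist.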

5. ALTERNATE KEY :-

There may be one or more attributes or a combination of attributes that uniquely identify each
tuple in a relation. These attributes or combinations of the attributes are called the candidate
keys. One key is chosen as the primary key from these candidate keys, and the remaining
candidate key, if it exists, is termed the alternate key. In other words, the total number of the
alternate keys is the total number of candidate keys minus the primary key. The alternate key
may or may not exist. If there is only one candidate key in a relation, it does not have an
alternate key.

For example, employee relation has two attributes, Employee_Id and PAN_No, that act as
candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other
candidate key, PAN_No, acts as the Alternate key.

6. COMPOSITE KEY :-

Whenever a primary key consists of more than one attribute, it is known as a composite key.
This key is also known as Concatenated Key.
For example, in employee relations, we assume that an employee may be assigned multiple
roles, and an employee may work on multiple projects simultaneously. So the primary key will
be composed of all three attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So
these attributes act as a composite key since the primary key comprises more than one
attribute.
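The composite key described above can be sketched with Python's built-in sqlite3 module: the primary key is the combination (Emp_ID, Emp_role, Proj_ID), so no single attribute needs to be unique on its own, but the full combination must be. Sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ASSIGNMENT (
        Emp_ID   INTEGER,
        Emp_role TEXT,
        Proj_ID  INTEGER,
        PRIMARY KEY (Emp_ID, Emp_role, Proj_ID)   -- composite key
    )
""")
rows = [(1, 'developer', 10),
        (1, 'tester',    10),    # same employee + project, different role: allowed
        (1, 'developer', 20)]    # same employee + role, different project: allowed
conn.executemany("INSERT INTO ASSIGNMENT VALUES (?, ?, ?)", rows)
try:
    conn.execute("INSERT INTO ASSIGNMENT VALUES (1, 'developer', 10)")  # exact repeat
except sqlite3.IntegrityError as e:
    print(e)   # the duplicate combination is rejected
```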

7. ARTIFICIAL KEY :-

The key created using arbitrarily assigned data are known as artificial keys. These keys are
created when a primary key is large and complex and has no relationship with many other
relations. The data values of the artificial keys are usually numbered in a serial order.
For example, the primary key, which is composed of Emp_ID, Emp_role, and Proj_ID, is large in
employee relations. So it would be better to add a new virtual attribute to identify each tuple in
the relation uniquely.

DIFFERENCE BETWEEN PRIMARY KEY AND FOREIGN KEY :-

S.NO. | PRIMARY KEY | FOREIGN KEY
1 | A primary key is used to ensure data in the specific column is unique. | A foreign key is a column or group of columns in a relational database table that provides a link between data in two tables.
2 | It uniquely identifies a record in the relational database table. | It refers to the field in a table which is the primary key of another table.
3 | Only one primary key is allowed in a table. | More than one foreign key is allowed in a table.
4 | It is a combination of UNIQUE and NOT NULL constraints. | It can contain duplicate values.
5 | It does not allow NULL values. | It can also contain NULL values.
6 | Its value cannot be deleted from the parent table. | Its value can be deleted from the child table.
7 | Its constraint can be implicitly defined on temporary tables. | Its constraint cannot be defined on local or global temporary tables.

KEY TAKEAWAYS:-
Primary Keys and Foreign Keys are the keys most commonly used to establish relationships
between tables; other kinds include candidate, super, alternate, composite, and artificial keys.
If a primary key is made up of more than one attribute, it is called a composite primary key.

KEYS

SUPER KEY CANDIDATE KEY


SUB LESSON 5.2

SUPER KEY AND CANDIDATE KEY

WHAT IS SUPER KEY?

A Super Key is a set of attributes that can uniquely identify a tuple. For example, STUD_NO and
(STUD_NO, STUD_NAME) are super keys.
In other words, a super key is a group of one or more attributes that identifies rows in a table;
the extra, non-key attributes in it may hold NULL values. Example: SNO + PHONE is a super key.

Adding zero or more attributes to the candidate key generates the super key.
A candidate key is a super key but vice versa is not true.

Example:

EmpSSN        EmpNum   Empname
9812345098    AB05     Shown
9876512345    AB06     Roslyn
199937890     AB07     James

Here {EmpSSN} and {EmpNum} each identify a row, so they and any of their supersets, such as {EmpSSN, Empname}, are super keys.

WHY DO WE NEED A SUPER KEY?

Here are some reasons for using keys in a DBMS:
Keys help you to identify any row of data in a table. In a real-world application a table can contain thousands of records, some of them near-duplicates, and keys in an RDBMS ensure that you can still identify each record uniquely.
Keys allow you to establish and identify relationships between tables.
Keys help you to enforce identity and integrity in those relationships.
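The defining property of a super key, that no two tuples agree on all of its attributes, can be checked mechanically. A minimal sketch in Python using the employee table from the example (the Dept column is a hypothetical addition, included only to show a non-unique attribute):

```python
def is_superkey(rows, attrs):
    """Return True if the attribute set `attrs` uniquely identifies every row."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False      # two rows share the same values for attrs
        seen.add(key)
    return True

employees = [
    {"EmpSSN": "9812345098", "EmpNum": "AB05", "Empname": "Shown",  "Dept": "Sales"},
    {"EmpSSN": "9876512345", "EmpNum": "AB06", "Empname": "Roslyn", "Dept": "Sales"},
    {"EmpSSN": "199937890",  "EmpNum": "AB07", "Empname": "James",  "Dept": "HR"},
]

print(is_superkey(employees, ["EmpSSN"]))           # True  - a key on its own
print(is_superkey(employees, ["EmpSSN", "Dept"]))   # True  - any superset is a super key
print(is_superkey(employees, ["Dept"]))             # False - duplicate values exist
```

Any attribute set for which the function returns True is a super key of this table instance; the minimal such sets are its candidate keys.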



WHAT IS CANDIDATE KEY?

A candidate key in SQL is a set of attributes that uniquely identifies tuples in a table. A candidate key is a super key with no redundant attributes, and the primary key should be selected from among the candidate keys. Every table must have at least one candidate key; a table can have multiple candidate keys but only a single primary key.

It is a minimal super key: removing any attribute from it would destroy uniqueness.
It must contain unique values.
Its attributes may contain NULL values, but a candidate key containing NULL values cannot be chosen as the primary key, because the primary key can never be NULL.
There can be more than one candidate key in a relation; for example, Stud_ID, Roll_No, and Email can each serve as a candidate key of a STUDENT relation, with STUD_NO typically chosen as the primary key.
A candidate key can be simple (having only one attribute, such as STUD_NO in STUDENT) or composite (such as {STUD_NO, COURSE_NO} in STUDENT_COURSE).
The maximum possible number of candidate keys in a relation with n attributes is C(n, ⌊n/2⌋); for example, a relation with 5 attributes, R(A, B, C, D, E), can have at most C(5, 2) = 10 candidate keys.
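This upper bound follows because no candidate key can be a proper subset of another (they form an antichain of attribute sets, whose maximum size is given by Sperner's theorem), and it can be computed directly:

```python
from math import comb

def max_candidate_keys(n):
    # No candidate key may be a subset of another, so the candidate keys form
    # an antichain over the n attributes; by Sperner's theorem the largest
    # possible antichain has C(n, floor(n/2)) members.
    return comb(n, n // 2)

print(max_candidate_keys(5))  # 10, matching R(A, B, C, D, E)
print(max_candidate_keys(4))  # 6
```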



DIFFERENCE BETWEEN SUPER KEY AND CANDIDATE KEY :-

1. Super key: an attribute (or set of attributes) used to uniquely identify all tuples in a relation. Candidate key: a minimal super key; every candidate key is a super key.
2. Super key: not all super keys are candidate keys. Candidate key: all candidate keys are super keys.
3. Super key: the super keys of a relation form the pool from which the candidate keys are selected. Candidate key: the candidate keys form the pool from which the primary key is selected.
4. Super key: in a relation, the number of super keys is greater than or equal to the number of candidate keys. Candidate key: the number of candidate keys is less than or equal to the number of super keys.
5. Super key: its attributes can contain NULL values. Candidate key: its attributes can also contain NULL values.

KEY TAKEAWAYS:-
Super Key - a set of one or more attributes that uniquely identifies rows in a table.

Candidate Key - a minimal super key: a set of attributes that uniquely identifies tuples with no redundant attributes.

SUB LESSON 5.3

PRIMARY KEY AND FOREIGN KEY

PRIMARY KEY :-

➢ A primary key and a foreign key are two important concepts in database management systems
(DBMS) that help to enforce the integrity and relationships between tables in a relational
database.
➢ A primary key is a unique identifier for each row in a table. It is used to enforce the uniqueness
of the data in a table, and is used as a reference to link data in different tables together. A
primary key is a candidate key that has been selected as the most convenient and suitable
identifier for the data in the table. The primary key is used to enforce referential integrity in the
database, which ensures that relationships between tables are maintained even if data is
updated or deleted.
➢ For example, consider a database that stores information about employees and departments.
The Employee table has columns for Employee ID, First Name, Last Name, and Department ID.
The Department table has columns for Department ID, Department Name, and Manager ID.
➢ The Employee ID column in the Employee table can be selected as the primary key, as it is a
unique identifier for each employee in the table. The primary key is used to enforce the
uniqueness of the data in the Employee table, and is used as a reference to link data in the
Department table.

FOREIGN KEY :-

➢ A foreign key is a column in one table that is a reference to the primary key of another table.
The foreign key is used to enforce referential integrity in the database, and ensures that
relationships between tables are maintained even if data is updated or deleted.

➢ In the example above, the Department ID column in the Employee table is a foreign key that
references the primary key of the Department table. The foreign key governs what happens when a
referenced department is deleted: with the default (restricting) behaviour the delete is rejected
while employees still reference that department, and with an ON DELETE CASCADE action the
associated employee rows are deleted along with it. Either way, the relationship between the
tables stays consistent when data is updated or deleted.
➢ Foreign keys can also be used to model a many-to-many relationship between tables. For
example, consider a database that stores information about employees and projects. The Employee
table has columns for Employee ID, First Name, and Last Name; the Project table has columns for
Project ID, Project Name, and Manager ID.
➢ In this example, the Employee table and the Project table have a many-to-many relationship, as
an employee can be associated with multiple projects, and a project can be associated with
multiple employees. To enforce this relationship, a junction table can be created that has
columns for Employee ID, Project ID, and Role.
➢ The Employee ID column in the junction table is a foreign key that references the primary key of
the Employee table, and the Project ID column is a foreign key that references the primary key
of the Project table. With cascading referential actions, deleting an employee or a project also
deletes the corresponding rows in the junction table, so no dangling associations remain.
➢ In summary, a primary key is a unique identifier for each row in a table, and is used to enforce
the uniqueness of the data in the table. A foreign key is a column in one table that references
the primary key of another table, and is used to enforce referential integrity and relationships
between tables in a database. Properly defining the primary key and foreign key in a database
design helps to enforce the integrity of the data and maintain relationships between tables,
even if data is updated or deleted.
➢ Here's an example of primary keys and foreign keys in tabular form:

DEPARTMENT TABLE :-
Department ID Department Name Manager ID
1 Sales 10
2 Marketing 20
3 IT 30

EMPLOYEE TABLE :-
Employee ID First Name Last Name Department ID
10 John Doe 1
20 Jane Doe 2
30 Bob Smith 3

➢ In this example, the Department ID column in the Department table is the primary key, as it is a
unique identifier for each department in the table. The primary key is used to enforce the
uniqueness of the data in the Department table, and is used as a reference to link data in the
Employee table.
➢ The Department ID column in the Employee table is a foreign key that references the primary
key of the Department table. The foreign key ensures that no employee row can reference a
department that does not exist; if a department is deleted, the DBMS either rejects the delete
(the default, restricting behaviour) or, when ON DELETE CASCADE is declared, deletes the
associated employees as well. This keeps the relationship between the tables consistent even as
data is updated or deleted.
➢ In this example, the primary key of the Department table is referenced by the foreign key in the
Employee table. The primary key is used to enforce the uniqueness of the data in the
Department table, and the foreign key is used to enforce referential integrity and maintain
relationships between the tables.
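The two tables above can be declared with these constraints in any relational DBMS. A sketch using Python's built-in sqlite3 module (the cascade shown works only because ON DELETE CASCADE is declared; SQLite's default action would instead reject the delete, and it enforces foreign keys only after the PRAGMA below):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""CREATE TABLE department (
    dept_id   INTEGER PRIMARY KEY,
    dept_name TEXT NOT NULL)""")
conn.execute("""CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name  TEXT,
    dept_id    INTEGER REFERENCES department(dept_id) ON DELETE CASCADE)""")

conn.execute("INSERT INTO department VALUES (1, 'Sales'), (2, 'Marketing')")
conn.execute("INSERT INTO employee VALUES (10, 'John', 'Doe', 1), (20, 'Jane', 'Doe', 2)")

# Referential integrity: an employee may not reference a nonexistent department.
try:
    conn.execute("INSERT INTO employee VALUES (30, 'Bob', 'Smith', 99)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

# Because ON DELETE CASCADE is declared, deleting a department removes its employees.
conn.execute("DELETE FROM department WHERE dept_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(remaining)  # 1
```

Dropping the ON DELETE CASCADE clause would make the DELETE above fail with an IntegrityError instead, which is the restricting behaviour described in the text.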

DIFFERENCE BETWEEN PRIMARY KEY AND FOREIGN KEY :-

1. Primary key: ensures that the data in a specific column is unique. Foreign key: a column or group of columns in one table that provides a link to data in another table.
2. Primary key: uniquely identifies a record in the relational database table. Foreign key: refers to the field in a table that is the primary key of another table.
3. Primary key: only one primary key is allowed in a table. Foreign key: more than one foreign key is allowed in a table.
4. Primary key: a combination of the UNIQUE and NOT NULL constraints. Foreign key: can contain duplicate values in a relational database table.
5. Primary key: does not allow NULL values. Foreign key: can contain NULL values.
6. Primary key: its value cannot be deleted from the parent table while it is referenced. Foreign key: its value can be deleted from the child table.
7. Primary key: the constraint can be implicitly defined on temporary tables. Foreign key: the constraint cannot be defined on local or global temporary tables.

KEY TAKEAWAYS:-
The two most commonly used keys are the primary key and the foreign key; they are normally used to establish relationships between tables.
A primary key made up of more than one column is called a composite primary key; a table can have only one primary key.

INDEXING

SUB LESSON 6.1

INDEX

INDEXING IN DBMS:-

➢ Indexing is used to optimize the performance of a database by minimizing the number


of disk accesses required when a query is processed.
➢ The index is a type of data structure. It is used to locate and access the data in a
database table quickly.

INDEX STRUCTURE:-

Indexes can be created on one or more database columns. An index itself has two columns:

➢ The first column of the index is the search key, which contains a copy of the primary
key or candidate key of the table. The search-key values are stored in sorted order so
that the corresponding data can be accessed quickly.
➢ The second column of the index is the data reference: a set of pointers
holding the address of the disk block where the record with that key value can be found.
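The two-column structure, search key plus data reference, can be sketched with an in-memory dictionary standing in for the index (the record values and the use of list positions as "disk block addresses" are illustrative):

```python
# Data file: records stored by position; the position stands in for a disk-block address.
records = [(101, "Asha"), (102, "Bilal"), (103, "Chen"), (104, "Dara")]

# Index: search key -> data reference (here, the record's position).
index = {key: pos for pos, (key, _) in enumerate(records)}

def lookup(key):
    pos = index.get(key)          # one index probe instead of a full file scan
    return records[pos] if pos is not None else None

print(lookup(103))    # (103, 'Chen')
print(lookup(999))    # None
```

A real DBMS stores the index itself on disk (typically as a B+-tree) rather than as a hash map in memory, but the lookup pattern is the same: probe the index, then follow one pointer to the data.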

INDEXING METHODS :-

The main indexing methods, discussed below, are ordered indices, primary indexing (with dense and sparse variants), clustering indexing, and secondary indexing.
ORDERED INDICES :-

The indices are usually kept sorted to make searching faster; such sorted indices are known
as ordered indices.
Example: Suppose we have an employee table with thousands of records, each 10 bytes long, with
IDs 1, 2, 3, and so on, and we have to search for the employee with ID 543.
➢ With no index, we must scan the data file from the start; the DBMS reads
543 × 10 = 5430 bytes before reaching the record.
➢ With an ordered index whose entries are 2 bytes each, the DBMS locates the record
after reading about 542 × 2 = 1084 bytes, far less than in the previous case.

PRIMARY INDEX :-

➢ If the index is created on the basis of the primary key of the table, then it is known as
primary indexing. These primary keys are unique to each record and contain 1:1 relation
between the records.
➢ As primary keys are stored in sorted order, the performance of the searching operation
is quite efficient.

➢ The primary index can be classified into two types: Dense index and sparse index.

DENSE INDEX :-

➢ The dense index contains an index record for every search key value in the data file. It
makes searching faster.
➢ In this, the number of records in the index table is same as the number of records in the
main table.
➢ It needs more space to store index record itself. The index records have the search key
and a pointer to the actual record on the disk.

SPARSE INDEX :-

➢ In the data file, index record appears only for a few items. Each item points to a block.
➢ In this, instead of pointing to each record in the main table, the index points to the
records in the main table in a gap.
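A sparse index keeps one entry per block and relies on a binary search over the index followed by a short scan inside one block. A minimal sketch (the block contents and record values are illustrative):

```python
import bisect

# Sparse index: one entry per block, keyed by the first search key in the block.
blocks = [
    [(1, "r1"), (2, "r2"), (3, "r3")],
    [(4, "r4"), (5, "r5"), (6, "r6")],
    [(7, "r7"), (8, "r8"), (9, "r9")],
]
index_keys = [blk[0][0] for blk in blocks]   # [1, 4, 7]

def sparse_lookup(key):
    # Find the last index entry <= key, then scan only that one block.
    i = bisect.bisect_right(index_keys, key) - 1
    if i < 0:
        return None
    for k, rec in blocks[i]:
        if k == key:
            return rec
    return None

print(sparse_lookup(5))  # 'r5'
```

Compared with a dense index, the index here holds 3 entries instead of 9, at the cost of scanning (at most) one block after the index probe.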

CLUSTERING INDEX :-

➢ A clustered index can be defined as an ordered data file. Sometimes the index is created
on non-primary key columns which may not be unique for each record.
➢ In this case, to identify the record faster, we will group two or more columns to get the
unique value and create index out of them. This method is called a clustering index.

➢ The records which have similar characteristics are grouped, and indexes are created for
these group.
Example: suppose a company contains several employees in each department. Suppose we use
a clustering index, where all employees which belong to the same Dept_ID are considered
within a single cluster, and index pointers point to the cluster as a whole. Here Dept_Id is a non-
unique key.

➢ If a single disk block is shared by records that belong to different clusters, lookups become
less efficient; using a separate disk block for each cluster is the better technique.

SECONDARY INDEX :-

➢ In the sparse indexing, as the size of the table grows, the size of mapping also grows. These
mappings are usually kept in the primary memory so that address fetch should be faster. Then
the secondary memory searches the actual data based on the address got from mapping. If the
mapping size grows then fetching the address itself becomes slower. In this case, the sparse
index will not be efficient. To overcome this problem, secondary indexing is introduced.
➢ In secondary indexing, to reduce the size of mapping, another level of indexing is introduced. In
this method, the huge range for the columns is selected initially so that the mapping size of the
first level becomes small. Then each range is further divided into smaller ranges. The mapping
of the first level is stored in the primary memory, so that address fetch is faster. The mapping of
the second level and actual data are stored in the secondary memory (hard disk).

For example, to find the record with roll number 111 using a two-level index:
➢ Search the first-level index for the largest entry less than or equal to 111; suppose it is 100.
➢ Follow its pointer to the second-level index and again find the largest entry less than or
equal to 111; suppose it is 110.
➢ Using the address stored with 110, go to the data block and scan each record sequentially
until 111 is found.
Insertions, updates, and deletions are performed in the same manner.
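The two-level search described above can be sketched as nested binary searches (the key ranges and block contents are illustrative, mirroring the roll-number example):

```python
import bisect

# Two-level index: a small first level stays in memory; each entry points to a
# second-level block, whose entries point to data blocks.
first_level  = [1, 100, 200]                       # start key of each 2nd-level block
second_level = [[1, 40, 70], [100, 110, 150], [200, 240, 270]]
data_blocks  = {110: [110, 111, 112, 113]}         # data block starting at key 110

def find(key):
    i = bisect.bisect_right(first_level, key) - 1      # level-1: largest entry <= key
    j = bisect.bisect_right(second_level[i], key) - 1  # level-2: largest entry <= key
    start = second_level[i][j]
    for rec in data_blocks.get(start, []):             # scan the data block sequentially
        if rec == key:
            return rec
    return None

print(find(111))  # 111: level 1 gives 100, level 2 gives 110, then a short scan
```

Only the small first level must fit in memory; each extra index level multiplies the key range that can be covered while keeping the in-memory portion small.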

KEY TAKEAWAYS:-
➢ Indexing is used to optimize the performance of a database by minimizing the number
of disk accesses required when a query is processed.
➢ Various types of indexing methods are Ordered Indices, Primary Index, Clustering Index
and Secondary Index.

DATABASE DESIGN



SUB LESSON 7.1
DATABASE DESIGN

INTRODUCTION:-

➢ We come across the word "database" quite often, and not only from a developer's perspective:
the term is widely used in non-technical groups and communities as well. Technically, a database
is a storage structure that brings different, related forms of data together in a single place.
We can therefore define a database as an organized collection of data, generally stored and
accessed electronically through computer systems. This lesson focuses on database design and
the terms and methodologies commonly associated with it.

WHAT IS DATABASE DESIGN ?

➢ Database design can be generally defined as a collection of tasks or processes that enhance the
designing, development, implementation, and maintenance of enterprise data management
system. Designing a proper database reduces the maintenance cost thereby improving data
consistency and the cost-effective measures are greatly influenced in terms of disk storage
space. Therefore, there has to be a brilliant concept of designing a database. The designer
should follow the constraints and decide how the elements correlate and what kind of data
must be stored.
➢ The main objective of database design is to produce the logical and physical design models of
the proposed database system. The logical model concentrates on the data requirements and
describes the data to be stored independently of any physical considerations. The physical
design model then translates the logical model onto physical storage, taking into account the
hardware resources and the software system, such as the Database Management System (DBMS),
that will manage the data.



WHY IS DATABASE DESIGN IMPORTANT ?

➢ The important consideration that can be taken into account while emphasizing the importance
of database design can be explained in terms of the following points given below.
1. Database designs provide the blueprints of how the data is going to be stored in a
system. A proper design of a database highly affects the overall performance of any application.
2. The designing principles defined for a database give a clear idea of the behavior of any
application and how the requests are processed.
3. Another instance to emphasize the database design is that a proper database design
meets all the requirements of users.
4. Lastly, the processing time of an application is greatly reduced if the constraints of
designing a highly efficient database are properly implemented.

LIFE CYCLE :-

➢ Although the life cycle of a database is not the main focus of this lesson, it is important,
before jumping into the design models that constitute database design, to understand the
overall workflow and life cycle of the database.



REQUIREMENT ANALYSIS :-

➢ First of all, the planning has to be done on what are the basic requirements of the project under
which the design of the database has to be taken forward. Thus, they can be defined as:-
➢ Planning - This stage is concerned with planning the entire DDLC (Database Development Life
Cycle). The strategic considerations are taken into account before proceeding.
➢ System definition - This stage covers the boundaries and scopes of the proper database after
planning.

DATABASE DESIGNING :-

➢ The next step involves designing the database according to the user requirements and
splitting the work into separate models so that no single aspect carries heavy dependencies.
This model-centric approach is where the logical and physical models play a crucial role.
➢ Logical Model - This stage is primarily concerned with developing a model based on the
proposed requirements. The entire model is designed on paper, without any implementation or
DBMS-specific considerations.
➢ Physical Model - The physical model is concerned with the practices and implementation of
the logical model on the chosen DBMS and hardware.

IMPLEMENTATION :-

➢ The last step covers the implementation methods and checking out the behavior that matches
our requirements. It is ensured with continuous integration testing of the database with
different data sets and conversion of data into machine understandable language. The
manipulation of data is primarily focused on these steps where queries are made to run and
check if the application is designed satisfactorily or not.



➢ Data conversion and loading - This section is used to import and convert data from the old to
the new system.
➢ Testing - This stage is concerned with error identification in the newly implemented system.
Testing is a crucial step because it checks the database directly and compares the requirement
specifications.

DATABASE DESIGN PROCESS :-

The process of designing a database involves several conceptual goals that need to be kept in
mind. An ideal, well-structured database design must:
1. Save disk space by eliminating redundant data.
2. Maintain data integrity and accuracy.
3. Provide data access in useful ways.
4. Keep the logical and physical data models consistent with each other.

KEY TAKEAWAYS :-
Database design is a method of identifying gaps and opportunities in how data will be organized
and used. It is the main component of a system, giving a blueprint of the data and its
behaviour inside the system.
A proper database design is always a priority: user requirements are demanding, and following
sound design constraints is what makes the requested efficiency achievable.
We also examined the design models that make up an ideal database design, their properties, and
how to use them; and we saw how the life cycle of a database shapes its design, so that
efficient, well-structured databases can be built from user requirements.





SUB LESSON 7.2

FUNCTIONAL DEPENDENCIES

In a relational database, functional dependencies are constraints that define the relationships
between attributes (columns) in a relation (table). These dependencies are a fundamental
concept in database management systems (DBMS) and are used to maintain the integrity and
consistency of the data in a relational database. Functional dependencies are often denoted
using an arrow symbol (→).
A functional dependency is defined as follows: Given a relation R, a set of attributes X in R
functionally determines a set of attributes Y in R if, for every possible combination of values of
X, there is a unique combination of values in Y. In mathematical notation, this is represented as
X → Y.
Here are some key points about functional dependencies:
➢ Functional Dependency Notation :- X → Y means that for every possible combination of values
in X, there is a unique combination of values in Y. This indicates that Y is functionally dependent
on X.
➢ Full Functional Dependency :- A functional dependency X → Y is considered full if removing any
attribute from X would break the dependency. In other words, no proper subset of X can
functionally determine Y.
➢ Partial Functional Dependency :- A functional dependency X → Y is considered partial if it's
possible to remove one or more attributes from X and still maintain the dependency. Partial
functional dependencies can lead to data anomalies and are typically discouraged in database
design.
➢ Transitive Functional Dependency :- If X → Y and Y → Z, then it implies that X → Z. This is called
a transitive functional dependency.
➢ Multivalued Dependency :- A special case of functional dependency is multivalued
dependency, which deals with attributes that have multiple values associated with them.



Functional dependencies are crucial for database normalization, which is the process of
organizing the data in a relational database to minimize data redundancy and improve data
integrity. The normal forms (1NF, 2NF, 3NF, BCNF, etc.) are used to check and enforce various
types of functional dependencies in the database design. Normalization ensures that data is
organized in such a way that updates and queries can be performed efficiently while avoiding
data anomalies and inconsistencies.
By understanding and specifying functional dependencies, database designers can create tables
and relationships that are optimized for data storage and retrieval, leading to a well-structured
and efficient database schema.
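Whether a functional dependency X → Y holds in a given table instance can be tested directly from the definition: every combination of X values must map to exactly one combination of Y values. A sketch (the student data is illustrative):

```python
def fd_holds(rows, X, Y):
    """Check whether the functional dependency X -> Y holds in `rows`."""
    mapping = {}
    for row in rows:
        x_val = tuple(row[a] for a in X)
        y_val = tuple(row[a] for a in Y)
        if mapping.setdefault(x_val, y_val) != y_val:
            return False   # same X values, different Y values: FD violated
    return True

students = [
    {"roll": 1, "name": "Asha",  "city": "Pune"},
    {"roll": 2, "name": "Bilal", "city": "Pune"},
    {"roll": 3, "name": "Asha",  "city": "Surat"},
]
print(fd_holds(students, ["roll"], ["name"]))  # True:  roll -> name
print(fd_holds(students, ["name"], ["city"]))  # False: 'Asha' maps to two cities
```

Note that a check like this can only refute an FD on a particular instance; whether an FD holds in general is a statement about the schema's semantics, not about any one sample of data.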

KEY TAKEAWAYS:-

A functional dependency is defined as follows: Given a relation R, a set of attributes X in R


functionally determines a set of attributes Y in R if, for every possible combination of values of
X, there is a unique combination of values in Y. In mathematical notation, this is represented as
X → Y.





SUB LESSON 7.3

NORMALIZATION

Normalization is a database design technique used in Database Management Systems (DBMS)


to structure relational databases in a way that reduces data redundancy and improves data
integrity. The primary goal of normalization is to minimize data anomalies (such as update
anomalies, insert anomalies, and delete anomalies) and ensure that data is stored efficiently,
without unnecessary duplication.
Normalization is typically carried out through a series of normal forms, each of which
represents a specific set of rules for organizing data in a relational database. The most
commonly used normal forms are First Normal Form (1NF), Second Normal Form (2NF), Third
Normal Form (3NF), Boyce-Codd Normal Form (BCNF), and Fourth Normal Form (4NF), among
others.
Here's an overview of these normal forms:
➢ First Normal Form (1NF) :- In 1NF, a table is organized so that each cell contains only an
atomic (indivisible) value and there are no repeating groups or arrays of data. Each attribute
in a row must have a single, unambiguous value.
➢ Second Normal Form (2NF) :- 2NF builds upon 1NF and involves eliminating partial
dependencies. To achieve 2NF, a table must have a primary key, and all non-key attributes must
be functionally dependent on the entire primary key. This helps avoid redundancy.
➢ Third Normal Form (3NF) :- 3NF extends 2NF by eliminating transitive dependencies. A table is
in 3NF if all non-key attributes are functionally dependent only on the primary key and not on
other non-key attributes. This further reduces data anomalies and improves data integrity.
➢ Boyce-Codd Normal Form (BCNF) :- BCNF is a stricter version of 3NF, specifically designed to
address situations where multiple candidate keys exist in a table. It ensures that for every non-
trivial functional dependency, the left-hand side is a superkey. BCNF eliminates certain types of
redundancy that might exist in 3NF tables.



➢ Fourth Normal Form (4NF) :- 4NF is used to handle multi-valued dependencies. A table is in
4NF if it is in BCNF and has no non-trivial multi-valued dependencies. This helps prevent
anomalies related to multi-valued attributes.
Normalization helps to reduce data redundancy, as each piece of data is stored in only one
place, which in turn minimizes the chances of data inconsistencies. It also improves the
efficiency of data retrieval and supports more flexible and efficient database queries. However,
it's essential to strike a balance between normalization and performance, as over-normalization
can lead to complex queries and potentially degrade performance.
The choice of normal forms to apply during database design depends on the specific
requirements of the application and the trade-offs between redundancy elimination and query
performance. In some cases, designers may opt for denormalization, which introduces some
controlled redundancy to improve query performance while maintaining data integrity.

KEY TAKEAWAYS :-

Normalization helps to reduce data redundancy, as each piece of data is stored in only one
place, which in turn minimizes the chances of data inconsistencies.

The choice of normal forms to apply during database design depends on the specific
requirements of the application and the trade-offs between redundancy elimination and query
performance.



NORMALIZATION



SUB LESSON 8.1

NORMAL FORM

TYPES OF NORMAL FORMS :-

In database management systems (DBMS), normal forms are a series of guidelines that help
to ensure that the design of a database is efficient, organized, and free from data anomalies.
There are several levels of normalization, each with its own set of guidelines, known as
normal forms.

IMPORTANT POINTS REGARDING NORMAL FORMS IN DBMS :-

➢ First Normal Form (1NF): This is the most basic level of normalization. In 1NF, each
table cell should contain only a single value, and each column should have a unique name.
The first normal form helps to eliminate duplicate data and simplify queries.
➢ Second Normal Form (2NF): 2NF eliminates redundant data by requiring that each
non-key attribute be dependent on the primary key. This means that each column should be
directly related to the primary key, and not to other columns.
➢ Third Normal Form (3NF): 3NF builds on 2NF by requiring that all non-key attributes
are independent of each other. This means that each column should be directly related to the
primary key, and not to any other columns in the same table.
➢ Boyce-Codd Normal Form (BCNF): BCNF is a stricter form of 3NF that ensures that
each determinant in a table is a candidate key. In other words, BCNF ensures that each non-
key attribute is dependent only on the candidate key.
➢ Fourth Normal Form (4NF): 4NF is a further refinement of BCNF that ensures that a
table does not contain any multi-valued dependencies.



➢ Fifth Normal Form (5NF): 5NF is the highest level of normalization and involves
decomposing a table into smaller tables to remove data redundancy and improve data
integrity.
Normal forms help to reduce data redundancy, increase data consistency, and improve
database performance. However, higher levels of normalization can lead to more complex
database designs and queries. It is important to strike a balance between normalization and
practicality when designing a database.

ADVANTAGES OF NORMAL FORM :-

➢ Reduced data redundancy: Normalization helps to eliminate duplicate data in tables,


reducing the amount of storage space needed and improving database efficiency.
➢ Improved data consistency: Normalization ensures that data is stored in a consistent
and organized manner, reducing the risk of data inconsistencies and errors.
➢ Simplified database design: Normalization provides guidelines for organizing tables
and data relationships, making it easier to design and maintain a database.
➢ Improved query performance: Normalized tables are typically easier to search and
retrieve data from, resulting in faster query performance.
➢ Easier database maintenance: Normalization reduces the complexity of a database by
breaking it down into smaller, more manageable tables, making it easier to add, modify, and
delete data.
Overall, using normal forms in DBMS helps to improve data quality, increase database
efficiency, and simplify database design and maintenance.

KEY TAKEAWAYS
➢ BCNF is free from redundancy.
➢ If a relation is in BCNF, then 3NF is also satisfied.
➢ If all attributes of relation are prime attribute, then the relation is always in 3NF.



➢ A relation in a Relational Database is always and at least in 1NF form.
➢ Every Binary Relation (a Relation with only 2 attributes) is always in BCNF.
➢ If a Relation has only singleton candidate keys (i.e. every candidate key consists of only
1 attribute), then the Relation is always in 2NF (because no Partial functional dependency
possible).
➢ Sometimes going for BCNF form may not preserve functional dependency. In that case
go for BCNF only if the lost FD(s) is not required, else normalize till 3NF only.
➢ There are many more Normal forms that exist after BCNF, like 4NF and more. But in
real world database systems it’s generally not required to go beyond BCNF.



NORMALIZATION



SUB LESSON 8.2

1NF 2NF 3NF


FIRST NORMAL FORM (1NF):-

➢ A relation is in 1NF if it contains only atomic values.

➢ It states that an attribute of a table cannot hold multiple values; it must hold only single-valued attributes.
➢ First normal form disallows multi-valued attributes, composite attributes, and their combinations.
Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE.

EMPLOYEE TABLE :-

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385, 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389, 8589830302 Punjab

The decomposition of the EMPLOYEE table into 1NF is shown below :-

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP

14 John 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389 Punjab

12 Sam 8589830302 Punjab

Example: When no single attribute identifies a record uniquely, two attributes may be required together, for instance Full Name and Address. Such a combination is called a composite key.
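As a runnable sketch of this decomposition (using Python's built-in sqlite3 module as a lightweight stand-in for a full RDBMS; the table and values follow the example above), each phone number becomes its own atomic row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 1NF version of the EMPLOYEE table: one atomic phone value per row.
cur.execute("""CREATE TABLE EMPLOYEE (
    EMP_ID INTEGER, EMP_NAME TEXT, EMP_PHONE TEXT, EMP_STATE TEXT)""")

# The multi-valued rows from the original table, flattened into atomic rows.
rows = [
    (14, "John", "7272826385", "UP"),
    (14, "John", "9064738238", "UP"),
    (20, "Harry", "8574783832", "Bihar"),
    (12, "Sam", "7390372389", "Punjab"),
    (12, "Sam", "8589830302", "Punjab"),
]
cur.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", rows)

# Every stored value is now atomic, so per-phone queries are straightforward:
cur.execute("SELECT EMP_PHONE FROM EMPLOYEE WHERE EMP_ID = 14 ORDER BY EMP_PHONE")
print([r[0] for r in cur.fetchall()])  # → ['7272826385', '9064738238']
```

Note that nothing about the data changed; only the shape did, so that each cell holds exactly one value.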

SECOND NORMAL FORM (2NF) :-

➢ For 2NF, the relation must first be in 1NF.

➢ In the second normal form, all non-key attributes must be fully functionally dependent on the primary key.

Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.



TEACHER TABLE :-

TEACHER_ID SUBJECT TEACHER_AGE

25 Chemistry 30

25 Biology 30

47 English 35

83 Math 38

83 Computer 38

In the given table, the non-prime attribute TEACHER_AGE depends on TEACHER_ID alone, which is a proper subset of the candidate key {TEACHER_ID, SUBJECT}. That is why it violates the rule for 2NF. To convert the given table into 2NF, we decompose it into two tables:

TEACHER_DETAIL TABLE :-

TEACHER_ID TEACHER_AGE

25 30

47 35

83 38

TEACHER_SUBJECT TABLE :-

TEACHER_ID SUBJECT

25 Chemistry

25 Biology

47 English

83 Math

83 Computer

It is clear that we cannot bring the database into Second Normal Form unless we partition the table as shown above.

A foreign key references the primary key of another table; it is what connects tables to one another. Here, TEACHER_ID in the TEACHER_SUBJECT table references the primary key of the TEACHER_DETAIL table.
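The 2NF decomposition above can be sketched with Python's sqlite3 module (an illustrative stand-in for an RDBMS; table names and data follow the example). A join on the shared TEACHER_ID recovers the original view without storing TEACHER_AGE redundantly:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 2NF decomposition of the TEACHER table from the example above.
cur.execute("CREATE TABLE TEACHER_DETAIL (TEACHER_ID INTEGER PRIMARY KEY, TEACHER_AGE INTEGER)")
cur.execute("CREATE TABLE TEACHER_SUBJECT (TEACHER_ID INTEGER, SUBJECT TEXT)")

cur.executemany("INSERT INTO TEACHER_DETAIL VALUES (?, ?)",
                [(25, 30), (47, 35), (83, 38)])
cur.executemany("INSERT INTO TEACHER_SUBJECT VALUES (?, ?)",
                [(25, "Chemistry"), (25, "Biology"), (47, "English"),
                 (83, "Math"), (83, "Computer")])

# Each teacher's age is stored once; a join on the shared key
# reconstructs the original (pre-decomposition) view.
cur.execute("""SELECT s.TEACHER_ID, s.SUBJECT, d.TEACHER_AGE
               FROM TEACHER_SUBJECT s
               JOIN TEACHER_DETAIL d ON d.TEACHER_ID = s.TEACHER_ID
               ORDER BY s.TEACHER_ID, s.SUBJECT""")
for row in cur.fetchall():
    print(row)
```

The join condition on TEACHER_ID is exactly the foreign-key relationship described above.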



THIRD NORMAL FORM (3NF) :-

➢ A relation will be in 3NF if it is in 2NF and does not contain any transitive dependency for non-prime attributes.
➢ 3NF is used to reduce data duplication. It is also used to achieve data integrity.
➢ If there is no transitive dependency for non-prime attributes, then the relation is in third normal form.
A relation is in third normal form if it holds at least one of the following conditions for every non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Example:



EMPLOYEE_DETAIL table:

EMP_ID EMP_NAME EMP_ZIP EMP_STATE EMP_CITY

222 Harry 201010 UP Noida

333 Stephan 02228 US Boston

444 Lan 60007 US Chicago

555 Katharine 06389 UK Norwich

666 John 462007 MP Bhopal

Super key in the table above:


1. {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on
Candidate key: {EMP_ID}
Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime.
Here, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and EMP_ZIP is dependent on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are thus transitively dependent on the candidate key (EMP_ID), which violates the rule of third normal form.
That is why we need to move EMP_CITY and EMP_STATE into a new EMPLOYEE_ZIP table, with EMP_ZIP as its primary key.

EMPLOYEE TABLE :-

EMP_ID EMP_NAME EMP_ZIP

222 Harry 201010

333 Stephan 02228

444 Lan 60007

555 Katharine 06389

666 John 462007

EMPLOYEE_ZIP table:

EMP_ZIP EMP_STATE EMP_CITY

201010 UP Noida

02228 US Boston

60007 US Chicago

06389 UK Norwich

462007 MP Bhopal
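The same idea can be checked with a short sqlite3 sketch (illustrative only; tables and data follow the example above). Each zip's state and city are now stored exactly once, and a join recovers the full employee detail:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# 3NF decomposition from the example: zip-dependent attributes move out.
# EMP_ZIP is TEXT so leading zeros like '02228' are preserved.
cur.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT, EMP_ZIP TEXT)")
cur.execute("CREATE TABLE EMPLOYEE_ZIP (EMP_ZIP TEXT PRIMARY KEY, EMP_STATE TEXT, EMP_CITY TEXT)")

cur.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                [(222, "Harry", "201010"), (333, "Stephan", "02228"),
                 (444, "Lan", "60007"), (555, "Katharine", "06389"),
                 (666, "John", "462007")])
cur.executemany("INSERT INTO EMPLOYEE_ZIP VALUES (?, ?, ?)",
                [("201010", "UP", "Noida"), ("02228", "US", "Boston"),
                 ("60007", "US", "Chicago"), ("06389", "UK", "Norwich"),
                 ("462007", "MP", "Bhopal")])

# Each city/state fact is stored once; a join recovers the full detail.
cur.execute("""SELECT e.EMP_ID, e.EMP_NAME, z.EMP_STATE, z.EMP_CITY
               FROM EMPLOYEE e JOIN EMPLOYEE_ZIP z ON z.EMP_ZIP = e.EMP_ZIP
               WHERE e.EMP_ID = 222""")
print(cur.fetchone())  # → (222, 'Harry', 'UP', 'Noida')
```

If a city name ever changes, it now needs updating in exactly one row of EMPLOYEE_ZIP.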

KEY TAKEAWAYS:-

APPLICATIONS OF NORMAL FORMS IN DBMS :-

• Data consistency: Normal forms ensure that data is consistent and does not contain
any redundant information. This helps to prevent inconsistencies and errors in the database.
• Data redundancy: Normal forms minimize data redundancy by organizing data into
tables that contain only unique data. This reduces the amount of storage space required for
the database and makes it easier to manage.



• Query performance: Normal forms can improve query performance by reducing the
number of joins required to retrieve data. This helps to speed up query processing and
improve overall system performance.
• Database maintenance: Normal forms make it easier to maintain the database by
reducing the amount of redundant data that needs to be updated, deleted, or modified. This
helps to improve database management and reduce the risk of errors or inconsistencies.
• Database design: Normal forms provide guidelines for designing databases that are
efficient, flexible, and scalable. This helps to ensure that the database can be easily modified,
updated, or expanded as needed.



OVERVIEW OF SQL

OVERVIEW OF SQL 137


SUB LESSON 9.1

OVERVIEW OF SQL

SQL :-

➢ SQL stands for Structured Query Language. It is used for storing and managing data in a relational database management system (RDBMS).
➢ It is a standard language for Relational Database System. It enables a user to create,
read, update and delete relational databases and tables.
➢ All the RDBMS like MySQL, Informix, Oracle, MS Access and SQL Server use SQL as their
standard database language.
➢ SQL allows users to query the database in a number of ways, using English-like
statements.

A BRIEF HISTORY OF SQL :-

➢ 1970 − Dr. Edgar F. "Ted" Codd of IBM, known as the father of relational databases, described a relational model for databases.
➢ 1974 − Structured Query Language (SQL) appeared.
➢ 1978 − IBM worked to develop Codd's ideas and released a product named System/R.
➢ 1986 − IBM developed the first prototype of a relational database, and SQL was standardized by ANSI. The first relational database had been released by Relational Software, which later came to be known as Oracle.
➢ 1987 − SQL became part of the International Organization for Standardization (ISO) standards.



RULES :-

➢ SQL follows these rules:

➢ Structured Query Language is not case sensitive. Generally, SQL keywords are written in uppercase.
➢ SQL statements are not dependent on text lines; a single SQL statement can be placed on one or multiple text lines.
➢ Using SQL statements, you can perform most of the actions in a database.
➢ SQL is based on tuple relational calculus and relational algebra.

WHY SQL ?

SQL is widely popular because it offers the following advantages −


➢ Allows users to access data in the relational database management systems.
➢ Allows users to describe the data.
➢ Allows users to define the data in a database and manipulate that data.
➢ Allows to embed within other languages using SQL modules, libraries & pre-compilers.
➢ Allows users to create and drop databases and tables.
➢ Allows users to create view, stored procedure, functions in a database.
➢ Allows users to set permissions on tables, procedures and views.

SQL PROCESS :-

➢ When an SQL command is executed on any RDBMS, the system figures out the best way to carry out the request, and the SQL engine determines how to interpret the task.
➢ Various components are involved in this process, such as the optimization engine, query engine, query dispatcher, and classic query engine.
➢ All non-SQL queries are handled by the classic query engine, but the SQL query engine does not handle logical files.



CHARACTERISTICS OF SQL :-

➢ SQL is easy to learn.


➢ SQL is used to access data from relational database management systems.
➢ SQL can execute queries against the database.
➢ SQL is used to describe the data.
➢ SQL is used to define the data in the database and manipulate it when needed.
➢ SQL is used to create and drop the database and table.
➢ SQL is used to create a view, stored procedure, function in a database.
➢ SQL allows users to set permissions on tables, procedures, and views.



ADVANTAGES OF SQL :-

SQL has the following advantages :-

➢ High speed :-
o Using SQL queries, the user can quickly and efficiently retrieve a large number of records from a database.
➢ No coding needed :-
o In the standard SQL, it is very easy to manage the database system. It doesn't require a
substantial amount of code to manage the database system.
➢ Well defined standards :-
o SQL databases use long-established standards that have been adopted by ISO and ANSI.
➢ Portability :-
o SQL can be used on laptops, PCs, servers, and even some mobile phones.
➢ Interactive language :-
o SQL is a domain-specific language used to communicate with the database. It is also used to receive answers to complex questions in seconds.
➢ Multiple data view :-
o Using the SQL language, the users can make different views of the database structure.

KEY TAKEAWAYS :-
SQL stands for Structured Query Language. It is used for storing and managing data in a relational database management system (RDBMS).
➢ SQL is easy to learn.
➢ SQL is used to access data from relational database management systems.
➢ SQL can execute queries against the database.
➢ SQL is used to describe the data.
➢ SQL is used to define the data in the database and manipulate it when needed.
➢ SQL is used to create and drop the database and table.
➢ SQL is used to create a view, stored procedure, function in a database.
➢ SQL allows users to set permissions on tables, procedures, and views.



SUB LESSON 9.2

BASIC AND ADVANCED QUERIES IN SQL

BASIC SQL QUERIES :-

➢ SELECT: The SELECT statement is used to retrieve data from a database. It allows you to specify
the columns you want to retrieve and the table from which you want to retrieve data. For
example

o SELECT column1, column2 FROM table_name;

➢ WHERE: The WHERE clause is used to filter data based on a specific condition. It allows you to
specify a condition that must be met for a row to be included in the result set. For example:

o SELECT column1, column2 FROM table_name WHERE condition;

➢ INSERT: The INSERT statement is used to add new rows of data to a database table. It allows
you to specify the values for each column in the new row. For example:

o INSERT INTO table_name (column1, column2) VALUES (value1, value2);

➢ UPDATE: The UPDATE statement is used to modify existing rows of data in a database table. It
allows you to specify the new values for each column that you want to update, as well as a
condition that must be met for the update to occur. For example:

o UPDATE table_name SET column1 = new_value1, column2 = new_value2 WHERE condition;



➢ DELETE: The DELETE statement is used to remove rows of data from a database table. It allows
you to specify a condition that must be met for a row to be deleted. For example:

o DELETE FROM table_name WHERE condition;
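The four statements above can be exercised end to end with Python's built-in sqlite3 module (a convenient stand-in for any RDBMS here; the students table and its data are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")

# INSERT: add new rows.
cur.executemany("INSERT INTO students (id, name, marks) VALUES (?, ?, ?)",
                [(1, "Asha", 72), (2, "Ravi", 55), (3, "Meena", 88)])

# SELECT with WHERE: filter rows by a condition.
cur.execute("SELECT name FROM students WHERE marks > 60 ORDER BY name")
print([r[0] for r in cur.fetchall()])  # → ['Asha', 'Meena']

# UPDATE: modify existing rows that meet a condition.
cur.execute("UPDATE students SET marks = 60 WHERE name = 'Ravi'")

# DELETE: remove rows that meet a condition (none qualify after the update).
cur.execute("DELETE FROM students WHERE marks < 60")
cur.execute("SELECT COUNT(*) FROM students")
print(cur.fetchone()[0])  # → 3
```

The WHERE clause plays the same filtering role in SELECT, UPDATE, and DELETE alike.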

ADVANCE SQL QUERIES :-

➢ JOIN: The JOIN clause is used to combine data from two or more tables based on a related
column between them. There are different types of JOINs such as INNER JOIN, LEFT JOIN, RIGHT
JOIN, and FULL OUTER JOIN.
➢ INNER JOIN: An INNER JOIN returns only the rows that match the join condition in both tables.
It is the most commonly used type of join and is the default if no join type is specified.
➢ OUTER JOIN: An OUTER JOIN returns all rows from one table and any matching rows from the
other table. There are three types of outer joins: LEFT JOIN, RIGHT JOIN, and FULL JOIN. A LEFT
JOIN returns all rows from the left table and any matching rows from the right table. A RIGHT
JOIN returns all rows from the right table and any matching rows from the left table. A FULL
JOIN returns all rows from both tables, regardless of whether there is a match.
➢ CROSS JOIN: A CROSS JOIN returns the cartesian product of the two tables, which is the result
of all possible combinations of rows from both tables.
Example:

SELECT column1, column2 FROM table1 JOIN table2 ON table1.column_id = table2.column_id;
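The difference between an INNER and a LEFT join can be seen in a small sqlite3 sketch (the emp and dept tables and their rows are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
cur.execute("CREATE TABLE emp (emp_name TEXT, dept_id INTEGER)")
cur.executemany("INSERT INTO dept VALUES (?, ?)", [(1, "Sales"), (2, "HR")])
cur.executemany("INSERT INTO emp VALUES (?, ?)",
                [("Amit", 1), ("Bela", 1), ("Chirag", None)])  # Chirag has no dept

# INNER JOIN: only rows with a match in both tables.
cur.execute("""SELECT e.emp_name, d.dept_name
               FROM emp e JOIN dept d ON d.dept_id = e.dept_id
               ORDER BY e.emp_name""")
print(cur.fetchall())  # → [('Amit', 'Sales'), ('Bela', 'Sales')]

# LEFT JOIN: all rows from the left table, NULL where there is no match.
cur.execute("""SELECT e.emp_name, d.dept_name
               FROM emp e LEFT JOIN dept d ON d.dept_id = e.dept_id
               ORDER BY e.emp_name""")
print(cur.fetchall())  # → [('Amit', 'Sales'), ('Bela', 'Sales'), ('Chirag', None)]
```

Chirag is dropped by the INNER JOIN but kept (with a NULL dept_name) by the LEFT JOIN, which is exactly the distinction described above.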

➢ GROUP BY: The GROUP BY clause is used to group rows in a table based on the values in one or
more columns. It is often used in conjunction with aggregate functions such as COUNT, SUM,
AVG, MIN, MAX, etc.
Example:



SELECT column1, COUNT(column2) FROM table_name GROUP BY column1;

➢ ORDER BY: The ORDER BY clause is used to sort the result set based on the values in one or more columns. It allows you to specify the sort order, either ascending (ASC) or descending (DESC).
Example:

SELECT column1, column2 FROM table_name ORDER BY column1 ASC;

➢ Subqueries: A subquery is a query that is embedded within another query.

➢ It can be used in various parts of a SQL statement, such as in the SELECT, FROM, WHERE, or HAVING clauses. Subqueries are enclosed in parentheses and can be used to retrieve intermediate results that are then used in the outer query.
➢ Example:

o SELECT column1, column2 FROM table1 WHERE column1 IN (SELECT column1 FROM table2 WHERE condition);
➢ Here is an example of a simple subquery that retrieves the average salary of all employees in the employees table:
o SELECT AVG(salary) FROM employees;

➢ UNION: The UNION operator combines the results of two or more SELECT statements into a single result set, letting you retrieve data from multiple tables or queries. It removes duplicate rows from the combined result set.



➢ Example :-
o SELECT column1, column2 FROM table1
o UNION
o SELECT column1, column2 FROM table2;
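GROUP BY, a subquery, and UNION can all be seen working together in one short sqlite3 sketch (the sales table and its values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("East", 100), ("East", 150), ("West", 200)])

# GROUP BY with an aggregate function: one row per region.
cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
print(cur.fetchall())  # → [('East', 250), ('West', 200)]

# Subquery: rows whose amount exceeds the overall average (450 / 3 = 150).
cur.execute("SELECT region, amount FROM sales WHERE amount > (SELECT AVG(amount) FROM sales)")
print(cur.fetchall())  # → [('West', 200)]

# UNION removes duplicates from the combined result set.
cur.execute("SELECT region FROM sales UNION SELECT 'North' ORDER BY region")
print([r[0] for r in cur.fetchall()])  # → ['East', 'North', 'West']
```

Note how UNION collapses the two 'East' rows into one, as described above.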

KEY TAKEAWAYS:-
➢ The basic query types in SQL are INSERT, UPDATE, DELETE and SELECT.
➢ Various types of joins are available to fetch records from one or more tables.
➢ SQL also provides various functions which can be used to retrieve appropriate data from a table in an appropriate manner.
➢ The GROUP BY and HAVING clauses each have their own significance.



SUB LESSON 9.3

RELATIONAL ALGEBRA AND CALCULUS

➢ Relational database systems are expected to be equipped with a query language that can assist
its users to query the database instances. There are two kinds of query languages − relational
algebra and relational calculus.

RELATIONAL ALGEBRA :-

➢ Relational algebra is a procedural query language, which takes instances of relations as input
and yields instances of relations as output. It uses operators to perform queries. An operator
can be either unary or binary. They accept relations as their input and yield relations as their
output. Relational algebra is performed recursively on a relation and intermediate results are
also considered relations.
The fundamental operations of relational algebra are as follows :-
➢ Select
➢ Project
➢ Union
➢ Set difference
➢ Cartesian product
➢ Rename
We will discuss all these operations in the following sections.
➢ Select Operation (σ)
o It selects tuples that satisfy the given predicate from a relation.
o Notation − σp(r)
o Where σ stands for selection predicate and r stands for relation. p is a propositional logic formula which may use connectors like and, or, and not. These terms may use relational operators like =, ≠, ≥, <, >, ≤.



o For example :-
o σsubject = "database"(Books)
o Output − Selects tuples from books where subject is 'database'.
o σsubject = "database" and price = "450"(Books)
o Output − Selects tuples from books where subject is 'database' and 'price' is 450.
o σsubject = "database" and price = "450" or year > "2010"(Books)
o Output − Selects tuples from books where subject is 'database' and 'price' is 450 or those books
published after 2010.
➢ Project Operation (∏)
o It projects (keeps) only the listed column(s) of a relation.
o Notation − ∏A1, A2, ..., An (r)
o Where A1, A2, ..., An are attribute names of relation r.
o Duplicate rows are automatically eliminated, as a relation is a set.
o For example −
o ∏subject, author (Books)
o Selects and projects columns named as subject and author from the relation Books.
➢ Union Operation (∪)
o It performs binary union between two given relations and is defined as −
o r ∪ s = { t | t ∈ r or t ∈ s}
o Notation − r U s
o Where r and s are either database relations or relation result set (temporary relation).
o For a union operation to be valid, the following conditions must hold −
o r, and s must have the same number of attributes.
o Attribute domains must be compatible.
o Duplicate tuples are automatically eliminated.
o ∏ author (Books) ∪ ∏ author (Articles)
o Output − Projects the names of the authors who have either written a book or an article or
both.
➢ Set Difference (−)



o The result of set difference operation is tuples, which are present in one relation but are not in
the second relation.
o Notation − r − s
o Finds all the tuples that are present in r but not in s.
o ∏ author (Books) − ∏ author (Articles)
o Output − Provides the name of authors who have written books but not articles.
➢ Cartesian Product (Χ)
o Combines information of two different relations into one.
o Notation − r Χ s
o Where r and s are relations and their output will be defined as −
o r Χ s = { q t | q ∈ r and t ∈ s}
o σauthor = 'tutorialspoint'(Books Χ Articles)
o Output − Yields a relation, which shows all the books and articles written by tutorialspoint.
➢ Rename Operation (ρ)
o The results of relational algebra are also relations but without any name. The rename operation
allows us to rename the output relation. 'rename' operation is denoted with small Greek letter
rho ρ.
o Notation − ρ x (E)
o Where the result of expression E is saved with name of x.
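Because relational algebra treats relations as sets of tuples, the fundamental operations can be modeled directly with plain Python sets. This is a toy model for intuition, not a query engine; the Books/Articles data below is invented (authors and prices added purely for the sake of the example):

```python
# Relations modeled as sets of tuples; attributes map to tuple positions.
# Books: (subject, author, price); Articles: (subject, author). Sample data only.
Books = {
    ("database", "Korth", 450),
    ("database", "Navathe", 300),
    ("networks", "Tanenbaum", 500),
}
Articles = {("database", "Korth"), ("ai", "Norvig")}

def select(relation, predicate):
    """sigma: tuples of the relation that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, positions):
    """pi: keep only the listed attribute positions; duplicates vanish (set semantics)."""
    return {tuple(t[i] for i in positions) for t in relation}

db_books = select(Books, lambda t: t[0] == "database" and t[2] == 450)
book_authors = project(Books, [1])        # authors of books
article_authors = project(Articles, [1])  # authors of articles

either = book_authors | article_authors        # union: wrote a book or an article
books_only = book_authors - article_authors    # set difference: books but not articles
product = {b + a for b in Books for a in Articles}  # Cartesian product r Χ s

print(sorted(books_only))  # → [('Navahe',) style 1-tuples: Navathe and Tanenbaum]
```

Duplicate elimination in project falls out of the set semantics, mirroring the note above that duplicates are automatically eliminated because a relation is a set.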
o Additional operations are −
▪ Set intersection
▪ Assignment
▪ Natural join

RELATIONAL CALCULUS :-

➢ In contrast to Relational Algebra, Relational Calculus is a non-procedural query language; that is, it tells what to do but never explains how to do it.
➢ Relational calculus exists in two forms −
o Tuple Relational Calculus (TRC)
o Filtering variable ranges over tuples



▪ Notation − {T | Condition}
o Returns all tuples T that satisfies a condition.
▪ For example −
▪ { T.name | Author(T) AND T.article = 'database' }
▪ Output − Returns tuples with 'name' from Author who has written article on 'database'.
➢ TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀).
▪ For example −
• { R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}
• Output − The above query will yield the same result as the previous one.
➢ Domain Relational Calculus (DRC)
o In DRC, the filtering variable uses the domain of attributes instead of entire tuple values (as done in TRC, mentioned above).
➢ Notation −
o { a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
➢ Where a1, a2 are attributes and P stands for formulae built by inner attributes.
o For example −
o {< article, page, subject > | ∈ TutorialsPoint ∧ subject = 'database'}
o Output − Yields Article, Page, and Subject from the relation TutorialsPoint, where subject is
database.
➢ Just like TRC, DRC can also be written using existential and universal quantifiers. DRC also
involves relational operators.
➢ The expression power of Tuple Relation Calculus and Domain Relation Calculus is equivalent to
Relational Algebra.

KEY TAKEAWAYS:-
➢ Relational algebra is a procedural query language that operates on relations (tables) and
provides a set of operations to manipulate the data stored in relational databases.
➢ Understanding both relational algebra and calculus is crucial for anyone working with relational
databases, including database designers, administrators, and developers. They provide a
theoretical underpinning for practical query languages like SQL and aid in the efficient retrieval
and manipulation of data in relational database systems.



PL/SQL

PL/SQL 152
SUB LESSON 10.1

BASIC CODE STRUCTURE

INTRODUCING PL/SQL BLOCK STRUCTURE AND ANONYMOUS BLOCK :-

PL/SQL program units organize the code into blocks. A block without a name is known as an
anonymous block. The anonymous block is the simplest unit in PL/SQL. It is called anonymous
block because it is not saved in the Oracle database.

An anonymous block is for one-time use only and is useful in certain situations, such as creating test units. The following illustrates the anonymous block syntax :-
[DECLARE]
Declaration statements;
BEGIN
Execution statements;
[EXCEPTION]
Exception handling statements;
END;
/

BLOCK STRUCTURE :-

PL/SQL blocks have a pre-defined structure in which the code is to be grouped. Below are
different sections of PL/SQL blocks.
1. Declaration section
2. Execution section
3. Exception-Handling section
These sections appear in the order shown in the syntax above.

DECLARATION SECTION :-

This is the first section of the PL/SQL blocks. This section is an optional part. This is the section
in which the declaration of variables, cursors, exceptions, subprograms, pragma instructions
and collections that are needed in the block will be declared. Below are few more
characteristics of this part.
• This particular section is optional and can be skipped if no declarations are needed.
• This should be the first section in a PL/SQL block, if present.
• This section starts with the keyword ‘DECLARE’ for triggers and anonymous block. For
other subprograms, this keyword will not be present. Instead, the part after the subprogram
name definition marks the declaration section.
• This section should always be followed by execution section.

EXECUTION SECTION :-

The execution part is the main and mandatory part, which actually executes the code written inside it. Since PL/SQL expects executable statements in this block, it cannot be empty; it should have at least one valid executable line of code. Below are a few more characteristics of this part.
• This can contain both PL/SQL code and SQL code.
• This can contain one or many blocks inside it as a nested block.
• This section starts with the keyword ‘BEGIN’.
• This section should be followed either by ‘END’ or Exception-Handling section (if
present)

EXCEPTION-HANDLING SECTION :-

Exceptions occur at run time and are unavoidable; to handle them, Oracle provides an exception-handling section in blocks. This section can also contain PL/SQL statements. It is an optional section of the PL/SQL block.
• This is the section where the exception raised in the execution block is handled.
• This section is the last part of the PL/SQL block.
• Control from this section can never return to the execution block.
• This section starts with the keyword ‘EXCEPTION’.
• This section should always be followed by the keyword ‘END’.
The Keyword ‘END’ marks the end of PL/SQL block.

KEY TAKEAWAYS:-
➢ PL/SQL (Procedural Language/Structured Query Language) is Oracle Corporation's procedural
language extension for SQL. It enables the creation of procedural logic, such as loops,
conditions, and exception handling, directly within the Oracle Database.

SUB LESSON 10.2
DATA TYPES

➢ PL/SQL variables, constants and parameters must have a valid data type, which specifies a storage format, constraints, and a valid range of values. PL/SQL data types fall into the four categories below. This chapter focuses on the SCALAR and LOB data types; the other two categories are covered in other chapters.

1. Scalar − Single values with no internal components, such as a NUMBER, DATE, or BOOLEAN.
2. Large Object (LOB) − Pointers to large objects that are stored separately from other data items, such as text, graphic images, video clips, and sound waveforms.
3. Composite − Data items that have internal components that can be accessed individually, for example, collections and records.
4. Reference − Pointers to other data items.

PL/SQL SCALAR DATA TYPES AND SUBTYPES :-

PL/SQL Scalar Data Types and Subtypes come under the following categories :-

1. Numeric − Numeric values on which arithmetic operations are performed.
2. Character − Alphanumeric values that represent single characters or strings of characters.
3. Boolean − Logical values on which logical operations are performed.
4. Datetime − Dates and times.

PL/SQL provides subtypes of data types. For example, the data type NUMBER has a subtype
called INTEGER. You can use the subtypes in your PL/SQL program to make the data types
compatible with data types in other programs while embedding the PL/SQL code in another
program, such as a Java program.

PL/SQL NUMERIC DATA TYPES AND SUBTYPES :-

The following table lists the PL/SQL predefined numeric data types and their subtypes −

1. PLS_INTEGER − Signed integer in range -2,147,483,648 through 2,147,483,647, represented in 32 bits
2. BINARY_INTEGER − Signed integer in range -2,147,483,648 through 2,147,483,647, represented in 32 bits
3. BINARY_FLOAT − Single-precision IEEE 754-format floating-point number
4. BINARY_DOUBLE − Double-precision IEEE 754-format floating-point number
5. NUMBER(prec, scale) − Fixed-point or floating-point number with absolute value in range 1E-130 to (but not including) 1.0E126. A NUMBER variable can also represent 0
6. DEC(prec, scale) − ANSI specific fixed-point type with maximum precision of 38 decimal digits
7. DECIMAL(prec, scale) − IBM specific fixed-point type with maximum precision of 38 decimal digits
8. NUMERIC(prec, scale) − Floating type with maximum precision of 38 decimal digits
9. DOUBLE PRECISION − ANSI specific floating-point type with maximum precision of 126 binary digits (approximately 38 decimal digits)
10. FLOAT − ANSI and IBM specific floating-point type with maximum precision of 126 binary digits (approximately 38 decimal digits)
11. INT − ANSI specific integer type with maximum precision of 38 decimal digits
12. INTEGER − ANSI and IBM specific integer type with maximum precision of 38 decimal digits
13. SMALLINT − ANSI and IBM specific integer type with maximum precision of 38 decimal digits
14. REAL − Floating-point type with maximum precision of 63 binary digits (approximately 18 decimal digits)

Following is a valid declaration :-

DECLARE
num1 INTEGER;
num2 REAL;
num3 DOUBLE PRECISION;
BEGIN
null;
END;
/
When the above code is compiled and executed, it produces the following result −
PL/SQL procedure successfully completed
PL/SQL CHARACTER DATA TYPES AND SUBTYPES :-
Following is the detail of the PL/SQL predefined character data types and their subtypes −

1. CHAR − A fixed-length character string with a maximum size of 32,767 bytes
2. VARCHAR2 − A variable-length character string with a maximum size of 32,767 bytes
3. RAW − Variable-length binary or byte string with maximum size of 32,767 bytes, not interpreted by PL/SQL
4. NCHAR − A fixed-length national character string with a maximum size of 32,767 bytes
5. NVARCHAR2 − A variable-length national character string with a maximum size of 32,767 bytes
6. LONG − A variable-length character string with a maximum size of 32,760 bytes
7. LONG RAW − Variable-length binary or byte string with maximum size of 32,760 bytes, not interpreted by PL/SQL
8. ROWID − Physical row identifier, the address of a row in an ordinary table
9. UROWID − Universal row identifier (physical, logical, or foreign row identifier)

PL/SQL BOOLEAN DATA TYPES :-

The BOOLEAN data type stores logical values that are used in logical operations. The logical
values are the Boolean values TRUE and FALSE and the value NULL.
However, SQL has no data type equivalent to BOOLEAN. Therefore, Boolean values cannot be
used in −
• SQL statements
• Built-in SQL functions (such as TO_CHAR)
• PL/SQL functions invoked from SQL statements

PL/SQL DATETIME AND INTERVAL TYPES :-

The DATE datatype is used to store fixed-length date-times, which include the time of day in
seconds since midnight. Valid dates range from January 1, 4712 BC to December 31, 9999 AD.
The default date format is set by the Oracle initialization parameter NLS_DATE_FORMAT. For example, the default might be 'DD-MON-YY', which includes a two-digit number for the day of the month, an abbreviation of the month name, and the last two digits of the year, for example, 01-OCT-12.
Each DATE includes the century, year, month, day, hour, minute, and second. The following table shows the valid values for each field −

Field Name − Valid Datetime Values − Valid Interval Values

YEAR − -4712 to 9999 (excluding year 0) − Any nonzero integer
MONTH − 01 to 12 − 0 to 11
DAY − 01 to 31 (limited by the values of MONTH and YEAR, according to the rules of the calendar for the locale) − Any nonzero integer
HOUR − 00 to 23 − 0 to 23
MINUTE − 00 to 59 − 0 to 59
SECOND − 00 to 59.9(n), where 9(n) is the precision of time fractional seconds − 0 to 59.9(n), where 9(n) is the precision of interval fractional seconds
TIMEZONE_HOUR − -12 to 14 (range accommodates daylight savings time changes) − Not applicable
TIMEZONE_MINUTE − 00 to 59 − Not applicable
TIMEZONE_REGION − Found in the dynamic performance view V$TIMEZONE_NAMES − Not applicable
TIMEZONE_ABBR − Found in the dynamic performance view V$TIMEZONE_NAMES − Not applicable

PL/SQL LARGE OBJECT (LOB) DATA TYPES :-

Large Object (LOB) data types refer to large data items such as text, graphic images, video clips,
and sound waveforms. LOB data types allow efficient, random, piecewise access to this data.
Following are the predefined PL/SQL LOB data types –

Data Type − Description − Size

BFILE − Used to store large binary objects in operating system files outside the database − System-dependent; cannot exceed 4 gigabytes (GB)
BLOB − Used to store large binary objects in the database − 8 to 128 terabytes (TB)
CLOB − Used to store large blocks of character data in the database − 8 to 128 TB
NCLOB − Used to store large blocks of NCHAR data in the database − 8 to 128 TB

PL/SQL USER-DEFINED SUBTYPES :-

A subtype is a subset of another data type, which is called its base type. A subtype has the same
valid operations as its base type, but only a subset of its valid values.
PL/SQL predefines several subtypes in the package STANDARD. For example, PL/SQL predefines
the subtypes CHARACTER and INTEGER as follows −
SUBTYPE CHARACTER IS CHAR;
SUBTYPE INTEGER IS NUMBER (38,0);
You can define and use your own subtypes. The following program illustrates defining and using
a user-defined subtype −

DECLARE
SUBTYPE name IS char(20);
SUBTYPE message IS varchar2(100);
salutation name;
greetings message;
BEGIN
salutation := 'Reader ';
greetings := 'Welcome to the World of PL/SQL';
dbms_output.put_line('Hello ' || salutation || greetings);
END;

/

When the above code is executed at the SQL prompt, it produces the following result −
Hello Reader Welcome to the World of PL/SQL

PL/SQL procedure successfully completed.

NULLS IN PL/SQL :-

PL/SQL NULL values represent missing or unknown data and they are not an integer, a
character, or any other specific data type. Note that NULL is not the same as an empty data
string or the null character value '\0'. A null can be assigned but it cannot be equated with
anything, including itself.
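This three-valued behaviour of NULL is easy to observe in any SQL engine; the sketch below uses SQLite via Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# In SQL, NULL compared with anything (even NULL itself) yields
# NULL, shown as None in Python, rather than TRUE.
(eq,) = conn.execute("SELECT NULL = NULL").fetchone()
print(eq)            # None

# IS NULL is the correct test for missing data.
(is_null,) = conn.execute("SELECT NULL IS NULL").fetchone()
print(is_null)       # 1
```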

KEY TAKEAWAYS
➢ In a Database Management System (DBMS), data types are used to define the type of data that
can be stored in a database column. Each column in a database table is associated with a
specific data type, which determines the kind of values that can be stored in that column. The
choice of data types is crucial for ensuring data integrity, optimizing storage, and facilitating
efficient data retrieval.
➢ Data types may vary slightly between different database management systems (e.g., MySQL,
PostgreSQL, Oracle, SQL Server), and some systems may offer additional proprietary data types.
It's essential to refer to the specific documentation of the DBMS you are using to understand
the available data types and their characteristics.

PL/SQL

CONTROL STRUCTURES, LOOPING STRUCTURES 1


SUB LESSON 10.3

CONTROL STRUCTURE, LOOPING STRUCTURES

➢ Loops in PL/SQL. There may be a situation when you need to execute a block of code several times. In general, statements are executed sequentially: the first statement in a block is executed first, followed by the second, and so on.
➢ Programming languages provide various control structures that allow for more complicated
execution paths.
➢ A loop statement allows us to execute a statement or group of statements multiple times.

PL/SQL provides the following types of loop to handle looping requirements:



1. PL/SQL Basic LOOP
In this loop structure, the sequence of statements is enclosed between the LOOP and the END LOOP statements. At each iteration, the sequence of statements is executed and then control resumes at the top of the loop.

2. PL/SQL WHILE LOOP
Repeats a statement or group of statements while a given condition is true. It tests the condition before executing the loop body.

WHILE condition LOOP
   sequence_of_statements
END LOOP;

3. PL/SQL FOR LOOP
Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.

DECLARE
   p NUMBER := 0;
BEGIN
   FOR k IN 1..500 LOOP  -- calculate pi with 500 terms
      p := p + ( ( (-1) ** (k + 1) ) / ((2 * k) - 1) );
   END LOOP;
   p := 4 * p;
   DBMS_OUTPUT.PUT_LINE( 'pi is approximately : ' || p );  -- print result
END;
/

4. Nested loops in PL/SQL
You can use one or more loops inside any other basic loop, WHILE loop, or FOR loop.
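The FOR loop example above sums 500 terms of the Leibniz series for pi (pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...); the same arithmetic can be checked in Python:

```python
import math

# Mirrors the PL/SQL sample: FOR k IN 1..500 LOOP ... END LOOP;
p = 0.0
for k in range(1, 501):
    p += ((-1) ** (k + 1)) / (2 * k - 1)
p = 4 * p

err = abs(p - math.pi)
print(round(p, 4))           # close to 3.1416; 500 terms give ~3 digits
```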

LABELING A PL/SQL LOOP :-

PL/SQL loops can be labeled. The label should be enclosed by double angle brackets (<< and >>)
and appear at the beginning of the LOOP statement. The label name can also appear at the end
of the LOOP statement. You may use the label in the EXIT statement to exit from the loop.
The following program illustrates the concept –

DECLARE
i number(1);
j number(1);
BEGIN
<< outer_loop >>
FOR i IN 1..3 LOOP
<< inner_loop >>
FOR j IN 1..3 LOOP
dbms_output.put_line('i is: '|| i || ' and j is: ' || j);
END loop inner_loop;
END loop outer_loop;
END;
/

When the above code is executed at the SQL prompt, it produces the following result −
i is: 1 and j is: 1
i is: 1 and j is: 2
i is: 1 and j is: 3
i is: 2 and j is: 1



i is: 2 and j is: 2
i is: 2 and j is: 3
i is: 3 and j is: 1
i is: 3 and j is: 2
i is: 3 and j is: 3

PL/SQL procedure successfully completed.

THE LOOP CONTROL STATEMENTS :-

Loop control statements change execution from its normal sequence. When execution leaves a
scope, all automatic objects that were created in that scope are destroyed.
PL/SQL supports the following control statements. Labelling loops also helps in taking control outside a loop.

1. EXIT statement
The EXIT statement forces a loop to complete unconditionally. When an EXIT statement is encountered, the loop completes immediately and control passes to the next statement.

2. CONTINUE statement
Causes the loop to skip the remainder of its body and immediately retest its condition prior to reiterating.

3. GOTO statement
The GOTO statement branches to a label unconditionally. The label must be unique within its scope and must precede an executable statement or a PL/SQL block. When executed, the GOTO statement transfers control to the labelled statement or block. The labelled statement or block can be down or up in the sequence of statements.

KEY TAKEAWAYS
➢ In the context of Database Management Systems (DBMS), control structures and looping
structures are not explicitly used as they are in traditional programming languages. However,
SQL, which is the language commonly used to interact with relational databases, has certain
constructs and clauses that can be considered as control and looping structures.
➢ It's important to note that the specific features available for control and looping structures can
vary between different database management systems. The examples provided above are
generalized, and you should refer to the documentation of the specific DBMS you are using for
more details on the supported constructs. Additionally, while these constructs exist, it's
common to perform set-based operations in SQL rather than explicitly using control or looping
structures whenever possible for better performance.



CREATE / REPLACE / UPDATE AND ALTER VIEWS
SUB LESSON 11.1

CREATE/REPLACE VIEWS

Views in SQL are kind of virtual tables. A view also has rows and columns as they are in a real
table in the database. We can create a view by selecting fields from one or more tables present
in the database. A View can either have all the rows of a table or specific rows based on certain
conditions. In this lesson we will learn about creating, deleting and updating views.

Sample Tables :-
StudentDetails

StudentMarks

CREATING VIEWS:-

We can create View using CREATE VIEW statement. A View can be created from a single table
or multiple tables.
Syntax:
CREATE VIEW view_name AS
SELECT column1, column2, ...
FROM table_name
WHERE condition;
view_name: Name for the View
table_name: Name of the table
condition: Condition to select rows

Examples:

CREATING VIEW FROM A SINGLE TABLE : -

In this example we will create a View named DetailsView from the table StudentDetails. Query:
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM StudentDetails
WHERE S_ID < 5;
● To see the data in the View, we can query the view in the same manner as we query a table.
SELECT * FROM DetailsView;
Output:

● In this example, we will create a view named StudentNames from the table StudentDetails.
Query:
CREATE VIEW StudentNames AS
SELECT S_ID, NAME
FROM StudentDetails
ORDER BY NAME;
● If we now query the view as,
SELECT * FROM StudentNames;



Output:

CREATING VIEW FROM MULTIPLE TABLES : -

In this example we will create a View named MarksView from two tables StudentDetails and
StudentMarks. To create a View from multiple tables we can simply include multiple tables in
the SELECT statement. Query:
CREATE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
To display data of View MarksView:
SELECT * FROM MarksView;
● Output:
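The sample tables above are shown as screenshots, so the sketch below recreates minimal stand-ins in SQLite via Python's sqlite3 module. The column names follow the lesson, but the rows (Harsh, Ashish, Pratik and their marks) are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Minimal stand-ins for the lesson's sample tables; the data is
# invented purely to exercise both CREATE VIEW statements.
cur.executescript("""
CREATE TABLE StudentDetails (S_ID INTEGER, NAME TEXT, ADDRESS TEXT);
CREATE TABLE StudentMarks   (NAME TEXT, MARKS INTEGER);

INSERT INTO StudentDetails VALUES (1, 'Harsh', 'Kolkata'),
                                  (2, 'Ashish', 'Durgapur'),
                                  (6, 'Pratik', 'Delhi');
INSERT INTO StudentMarks   VALUES ('Harsh', 90), ('Ashish', 74);

-- View over a single table, restricted by a condition.
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS FROM StudentDetails WHERE S_ID < 5;

-- View joining two tables.
CREATE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
""")

# sorted() makes the result order deterministic for display.
rows1 = sorted(cur.execute("SELECT * FROM DetailsView").fetchall())
print(rows1)   # [('Ashish', 'Durgapur'), ('Harsh', 'Kolkata')]

rows2 = sorted(cur.execute("SELECT * FROM MarksView").fetchall())
print(rows2)   # [('Ashish', 'Durgapur', 74), ('Harsh', 'Kolkata', 90)]
```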

LISTING ALL VIEWS IN A DATABASE :-

We can list View using the SHOW FULL TABLES statement or using the information_schema
table. A View can be created from a single table or multiple tables.
Syntax (Using SHOW FULL TABLES):
use "database_name";



show full tables where table_type like "%VIEW";
Syntax (Using information_schema) :
select * from information_schema.views where table_schema = "database_name";
OR
select table_schema,table_name,view_definition from information_schema.views where
table_schema = "database_name";
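SHOW FULL TABLES and information_schema are MySQL features; other engines expose the catalogue differently. In SQLite, for instance, the sqlite_master catalogue table plays the same role, and it also shows the effect of DROP VIEW:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (x INTEGER);
CREATE VIEW v1 AS SELECT x FROM t;
CREATE VIEW v2 AS SELECT x * 2 AS x2 FROM t;
""")

# SQLite's catalogue lives in sqlite_master; filtering on
# type = 'view' lists every view in the database.
views_before = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name")]
print(views_before)          # ['v1', 'v2']

# DROP VIEW removes only the view definition; the base table stays.
conn.execute("DROP VIEW v2")
views_after = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view' ORDER BY name")]
print(views_after)           # ['v1']
```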

DELETING VIEWS:-

We have learned about creating a View, but what if a created View is not needed anymore?
Obviously, we will want to delete it. SQL allows us to delete an existing View. We can delete or
drop a View using the DROP statement.
Syntax:
DROP VIEW view_name;
view_name: Name of the View which we want to delete.
For example, if we want to delete the View MarksView, we can do this as:
DROP VIEW MarksView;

UPDATING VIEWS :-

There are certain conditions needed to be satisfied to update a view. If any one of these
conditions is not met, then we will not be allowed to update the view.
1. The SELECT statement which is used to create the view should not include GROUP BY clause or
ORDER BY clause.
2. The SELECT statement should not have the DISTINCT keyword.
3. The View should have all NOT NULL values.
4. The view should not be created using nested queries or complex queries.
5. The view should be created from a single table. If the view is created using multiple tables then
we will not be allowed to update the view.
● We can use the CREATE OR REPLACE VIEW statement to add or remove fields from a view.



Syntax:
CREATE OR REPLACE VIEW view_name AS SELECT column1,column2,.. FROM table_name
WHERE condition;
For example, if we want to update the view MarksView and add the field AGE to this View
from StudentMarks Table, we can do this as:
CREATE OR REPLACE VIEW MarksView AS
SELECT StudentDetails.NAME, StudentDetails.ADDRESS, StudentMarks.MARKS,
StudentMarks.AGE
FROM StudentDetails, StudentMarks
WHERE StudentDetails.NAME = StudentMarks.NAME;
If we fetch all the data from MarksView now as:
SELECT * FROM MarksView;
Output:
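CREATE OR REPLACE VIEW is not available in every engine; SQLite, for example, does not support it, and the usual workaround is DROP VIEW IF EXISTS followed by CREATE VIEW. A runnable sketch (table contents are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentMarks (NAME TEXT, MARKS INTEGER, AGE INTEGER);
INSERT INTO StudentMarks VALUES ('Harsh', 90, 19);
CREATE VIEW MarksView AS SELECT NAME, MARKS FROM StudentMarks;
""")

# SQLite lacks CREATE OR REPLACE VIEW; dropping and recreating the
# view achieves the same "add a field to the view" effect.
conn.executescript("""
DROP VIEW IF EXISTS MarksView;
CREATE VIEW MarksView AS SELECT NAME, MARKS, AGE FROM StudentMarks;
""")

cols = [d[0] for d in conn.execute("SELECT * FROM MarksView").description]
print(cols)                  # ['NAME', 'MARKS', 'AGE']
```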

INSERTING A ROW IN A VIEW :-

We can insert a row in a View in the same way as we do in a table. We can use the INSERT INTO
statement of SQL to insert a row in a View.
Syntax:
INSERT INTO view_name(column1, column2, column3, ...) VALUES (value1, value2, value3, ...);
view_name: Name of the View
Example: In the below example we will insert a new row in the View DetailsView which we have
created above in the example of “creating views from a single table”.

INSERT INTO DetailsView(NAME, ADDRESS) VALUES("Suresh","Gurgaon");


If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
Output:
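Whether a view accepts INSERT directly depends on the engine. SQLite views, for example, are read-only by default, so this sketch uses an INSTEAD OF trigger to route the insert to the base table, which is the effect an updatable view gives you implicitly in Oracle or SQL Server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentDetails (S_ID INTEGER PRIMARY KEY, NAME TEXT, ADDRESS TEXT);
CREATE VIEW DetailsView AS SELECT NAME, ADDRESS FROM StudentDetails;

-- SQLite views are read-only; an INSTEAD OF trigger redirects the
-- INSERT on the view to the underlying base table.
CREATE TRIGGER details_view_insert
INSTEAD OF INSERT ON DetailsView
BEGIN
    INSERT INTO StudentDetails (NAME, ADDRESS)
    VALUES (NEW.NAME, NEW.ADDRESS);
END;
""")

conn.execute("INSERT INTO DetailsView (NAME, ADDRESS) VALUES ('Suresh', 'Gurgaon')")
rows = conn.execute("SELECT NAME, ADDRESS FROM StudentDetails").fetchall()
print(rows)                  # [('Suresh', 'Gurgaon')]
```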



DELETING A ROW FROM A VIEW :-

Deleting rows from a view is as simple as deleting rows from a table. We can use the
DELETE statement of SQL to delete rows from a view. Deleting a row from a view first
deletes the row from the actual table, and the change is then reflected in the view.
Syntax:
DELETE FROM view_name WHERE condition;
view_name: Name of view from where we want to delete rows.
condition: Condition to select rows.
Example: In this example, we will delete the last row from the view DetailsView which we just
added in the above example of inserting rows.
DELETE FROM DetailsView WHERE NAME="Suresh";
If we fetch all the data from DetailsView now as,
SELECT * FROM DetailsView;
Output:



WITH CHECK OPTION :-

The WITH CHECK OPTION clause in SQL is a very useful clause for views. It is applicable to an
updatable view. If the view is not updatable, then there is no meaning of including this clause in
the CREATE VIEW statement.
● The WITH CHECK OPTION clause is used to prevent the insertion of rows in the view where the
condition in the WHERE clause in the CREATE VIEW statement is not satisfied.
● If we have used the WITH CHECK OPTION clause in the CREATE VIEW statement, and if the
UPDATE or INSERT clause does not satisfy the conditions then they will return an error.
Example: In the below example we are creating a View SampleView from StudentDetails Table
with WITH CHECK OPTION clause.
CREATE VIEW SampleView AS SELECT S_ID, NAME FROM StudentDetails WHERE NAME IS NOT
NULL WITH CHECK OPTION;
In this View, if we now try to insert a new row with a null value in the NAME column, it will
give an error because the view is created with the condition that the NAME column must not be
NULL. For example, even though the View is updatable, the below query for this View is
not valid:
INSERT INTO SampleView(S_ID) VALUES(6);
NOTE: The default value of the NAME column is null.
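Not every engine implements WITH CHECK OPTION; SQLite, for instance, does not, but the same rule can be emulated with an INSTEAD OF trigger that rejects rows violating the view's WHERE condition. A sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StudentDetails (S_ID INTEGER, NAME TEXT);
CREATE VIEW SampleView AS
SELECT S_ID, NAME FROM StudentDetails WHERE NAME IS NOT NULL;

-- Emulation of WITH CHECK OPTION: reject any row that would fall
-- outside the view's WHERE condition, pass valid rows through.
CREATE TRIGGER sample_view_check
INSTEAD OF INSERT ON SampleView
BEGIN
    SELECT RAISE(ABORT, 'view check failed') WHERE NEW.NAME IS NULL;
    INSERT INTO StudentDetails (S_ID, NAME) VALUES (NEW.S_ID, NEW.NAME);
END;
""")

conn.execute("INSERT INTO SampleView (S_ID, NAME) VALUES (1, 'Harsh')")
try:
    conn.execute("INSERT INTO SampleView (S_ID) VALUES (6)")   # NAME is NULL
    rejected = False
except sqlite3.DatabaseError:
    rejected = True
print(rejected)              # True

count = conn.execute("SELECT COUNT(*) FROM StudentDetails").fetchone()[0]
print(count)                 # 1: only the valid row was stored
```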
Uses of a View: A good database should contain views for the given reasons:
1. Restricting data access – Views provide an additional level of table security by restricting access
to a predetermined set of rows and columns of a table.
2. Hiding data complexity – A view can hide the complexity that exists in multiple tables join.
3. Simplify commands for the user – Views allow the user to select information from multiple
tables without requiring the users to actually know how to perform a join.
4. Store complex queries – Views can be used to store complex queries.
5. Rename Columns – Views can also be used to rename the columns without affecting the base
tables provided the number of columns in view must match the number of columns specified in
select statement. Thus, renaming helps to hide the names of the columns of the base tables.
6. Multiple view facility – Different views can be created on the same table for different users.



KEY TAKEAWAYS :-
In the context of a database management system (DBMS), the term "views" refers to virtual
tables generated by queries on one or more base tables. Views offer several advantages in a
DBMS environment:
⮚ Data Abstraction :-
Simplified Access: Views provide a simplified and abstracted interface to the underlying data.
Users can interact with the data through views without needing to understand the complex
structure of the underlying tables.
⮚ Security :-
Selective Access: Views can be used to restrict access to certain columns or rows of a table. This
enables administrators to control the data that users can see, enhancing security and privacy.
⮚ Simplified Querying :-
Complex Queries: Views can be used to encapsulate complex queries, making it easier for users
to execute common or complex queries without needing to write the full SQL code each time.
⮚ Data Integrity :-
Enforcement of Business Rules: Views can be designed to enforce business rules, ensuring that
only valid data is visible or modifiable through the view. This helps in maintaining data integrity.
⮚ Performance Optimization :-
Subset of Data: Views allow for the creation of subsets of data that are relevant to specific
applications or user groups. This can improve query performance by focusing on a smaller set of
data.
⮚ Logical Data Independence :-
Changes to Underlying Schema: Views provide a layer of abstraction between the physical
schema (how data is stored) and the logical schema (how data is perceived). This means that
changes to the underlying tables' structure do not necessarily affect applications using views.
⮚ Customization :-
Personalized Views: Users can create their own views tailored to their specific needs, selecting
only the columns and rows they are interested in. This customization enhances user experience.



⮚ Consistency :-
Standardized Data Presentation: Views allow for the creation of standardized data
presentations. This ensures consistency in how data is presented and reduces the likelihood of
errors in querying or reporting.
⮚ Encapsulation :-
Encapsulating Complex Logic: Views can encapsulate complex data retrieval logic, making it
easier to manage and maintain. This is especially useful when dealing with calculations,
aggregations, or transformations.
⮚ Dependency Management :-
Reduced Dependency on Table Structure: Applications and reports can be developed based on
views, reducing the dependency on the underlying table structure. This makes it easier to adapt
to changes in the database schema.
While views offer these advantages, it's important to consider their potential impact on
performance, as complex views or nested views may introduce overhead. Careful design and
optimization are crucial for maximizing the benefits of views in a DBMS.



Stored Procedure

STORED PROCEDURE
SUB LESSON 12.1

UNDERSTANDING THE MAIN FEATURES OF STORED PROCEDURE


A stored procedure is a group of one or more pre-compiled SQL statements into a logical unit.
It is stored as an object inside the database server. It is a subroutine or a subprogram in the
common computing language that has been created and stored in the database. Each
procedure in SQL Server always contains a name, parameter lists, and Transact-SQL
statements. The SQL Database Server stores the stored procedures as named objects. We can
invoke the procedures by using triggers, other procedures, and applications such
as Java, Python, PHP, etc. It can support almost all relational database systems.
SQL Server builds an execution plan when the stored procedure is called the first time and
stores them in the cache memory. The plan is reused by SQL Server in subsequent executions of
the stored procedure, allowing it to run quickly and efficiently.

FEATURES OF STORED PROCEDURES :-

The following are the features of stored procedure :-


o Reduced Traffic :- A stored procedure reduces network traffic between the application and the
database server, resulting in increased performance. It is because instead of sending
several SQL statements, the application only needs to send the name of the stored procedure
and its parameters.
o Stronger Security :- The procedure is always secure because it manages which processes and
activities we can perform. It removes the need for permissions to be granted at the database
object level and simplifies the security layers.
o Reusable :- Stored procedures are reusable. It reduces code inconsistency, prevents
unnecessary rewrites of the same code, and makes the code transparent to all applications or
users.
o Easy Maintenance :- The procedures are easier to maintain without restarting or deploying the
application.

o Improved Performance :- Stored Procedure increases the application performance. Once we
create the stored procedures and compile them the first time, it creates an execution plan
reused for subsequent executions. The procedure is usually processed quicker because the
query processor does not have to create a new plan.

TYPES OF STORED PROCEDURES :-

1. User-defined Stored Procedures


2. System Stored Procedures

USER-DEFINED STORED PROCEDURES :-

Database developers or database administrators build user-defined stored procedures. These


procedures provide one or more SQL statements for selecting, updating, or removing data from
database tables. A stored procedure specified by the user accepts input parameters and returns
output parameters. DDL and DML commands are used together in a user-defined procedure.
We can further divide this procedure into two types:
o T-SQL Stored Procedures: Transact-SQL procedures are one of the most popular types of SQL
Server procedures. It takes parameters and returns them. These procedures handle INSERT,
UPDATE, and DELETE statements with or without parameters and output row data.
o CLR Stored Procedures: The SQL Server procedures are a group of SQL commands, and the CLR
indicates the common language runtime. CLR stored procedures are made up of the CLR and a
stored procedure, which is written in a CLR-based language like VB.NET or C#. CLR procedures
are .Net objects that run in the SQL Server database's memory.

SYSTEM STORED PROCEDURES :-

The server's administrative tasks depend primarily on system stored procedures. When SQL
Server is installed, it creates system procedures. The system stored procedures prevent the
administrator from querying or modifying the system and database catalog tables directly.
Developers often ignore system stored procedures.

STORED PROCEDURES SYNTAX:-

CREATE PROCEDURE [schema_name].procedure_name
@parameter_name data_type,
....
@parameter_name data_type
AS
BEGIN
-- SQL statements
-- SELECT, INSERT, UPDATE, or DELETE statement
END

Schema_name: It is the name of your database or schema. By default, a procedure is associated


with the current database, but we can also create it into another database by specifying the DB
name.

Procedure_Name: It represents the name of your stored procedure that should be meaningful
names so that you can identify them quickly. It cannot be the system reserved keywords.

Parameter_Name: It represents the names of the procedure's parameters. A procedure may take
zero or more parameters, based upon user requirements. We must ensure that each parameter
uses an appropriate data type. For example, @Name VARCHAR(50).

SET NOCOUNT ON IN STORED PROCEDURE : -

In some cases, we use the SET NOCOUNT ON statement in the stored procedure. This
statement prevents the message that displays the number of rows affected by SQL queries
from being shown. NOCOUNT denotes that the count is turned off. It means that if SET
NOCOUNT ON is set, no message would appear indicating the number of rows affected.

HOW TO EXECUTE/CALL A STORED PROCEDURE ?

We can use the EXEC command to call/execute stored procedures in SQL Server. The following
syntax illustrates the execution of a stored procedure:
EXEC procedure_name;
Or,

EXECUTE procedure_name;

Example :
CREATE PROCEDURE studentList
AS
BEGIN
SELECT name, age, salary
FROM STUDENT
ORDER BY salary;
END;
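SQLite, used here for a runnable illustration, has no stored procedures, so the sketch below wraps the studentList query in a Python function; the STUDENT rows are illustrative assumptions, but the encapsulation-and-reuse benefit is the same one described above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (name TEXT, age INTEGER, salary INTEGER);
INSERT INTO STUDENT VALUES ('Asha', 22, 30000), ('Ravi', 24, 25000);
""")

# Application-side analogue of the studentList procedure: the query
# text lives in one place and every caller reuses it.
def student_list(conn):
    return conn.execute(
        "SELECT name, age, salary FROM STUDENT ORDER BY salary").fetchall()

rows = student_list(conn)            # plays the role of EXEC studentList
print(rows)
# [('Ravi', 24, 25000), ('Asha', 22, 30000)]
```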

KEY TAKEAWAYS:-

Stored procedures are a type of database object in database management systems (DBMS) that
encapsulate a series of SQL statements and procedural logic. They provide several features and
advantages for database development and management. Here are some key features of stored
procedures:
Parameter Support :- Stored procedures can accept parameters, allowing them to receive input
values when executed. This makes procedures versatile and reusable with different input
values.
Encapsulation of Logic :- Stored procedures encapsulate procedural logic, allowing developers
to organize and centralize complex business logic within the database. This promotes code
reusability and maintainability.
Transaction Control :- Stored procedures can include explicit transaction control statements
(BEGIN TRANSACTION, COMMIT, ROLLBACK) to manage transactions. This is useful for ensuring
data consistency and integrity.
Error Handling :- Stored procedures can include error-handling mechanisms, such as
TRY...CATCH blocks, to manage exceptions and handle errors gracefully.

Dynamic SQL :- Stored procedures can generate and execute dynamic SQL statements,
allowing for more flexibility in constructing and executing queries based on runtime conditions.
Improved Performance :- Stored procedures can improve performance by precompiling and
optimizing SQL statements, reducing the overhead associated with repeated query compilation.
Access Control :- Stored procedures provide a mechanism for controlling access to data. Users
can be granted permissions to execute specific stored procedures while restricting direct access
to underlying tables.
Code Reusability :- Since stored procedures encapsulate logic, they can be reused across
different parts of an application or by different applications, promoting modular development.
Reduced Network Traffic :- By executing logic on the database server, stored procedures can
reduce the amount of data transferred between the database server and the client application,
resulting in improved performance.
Enhanced Security :- Stored procedures can enhance security by allowing the database
administrator to control access to data and operations. Users may only be granted permission
to execute specific stored procedures, limiting their interaction with the database.

STORED PROCEDURES
SUB LESSON 12.2

STORED PROCEDURE ARCHITECTURE

STORED PROCEDURES SYNTAX :-

CREATE PROCEDURE [schema_name].procedure_name


@parameter_name data_type,
....
@parameter_name data_type
AS
BEGIN
-- SQL statements
-- SELECT, INSERT, UPDATE, or DELETE statement
END

Schema_name: It is the name of your database or schema. By default, a procedure is associated


with the current database, but we can also create it into another database by specifying the DB
name.

Procedure_Name: It represents the name of your stored procedure that should be meaningful
names so that you can identify them quickly. It cannot be the system reserved keywords.

Parameter_Name: It represents the number of parameters. It may be zero or more based upon
the user requirements. We must ensure that we used the appropriate data type. For example,
@Name VARCHAR(50).

SET NOCOUNT ON IN STORED PROCEDURE : -

In some cases, we use the SET NOCOUNT ON statement in the stored procedure. This
statement prevents the message that displays the number of rows affected by SQL queries
from being shown. NOCOUNT denotes that the count is turned off. It means that if SET
NOCOUNT ON is set, no message would appear indicating the number of rows affected.

HOW TO EXECUTE/CALL A STORED PROCEDURE ?

We can use the EXEC command to call/execute stored procedures in SQL Server. The following
syntax illustrates the execution of a stored procedure:
EXEC procedure_name;
Or,
EXECUTE procedure_name;

Example :
CREATE PROCEDURE studentList
AS
BEGIN
SELECT name, age, salary
FROM STUDENT
ORDER BY salary;
END;

KEY TAKEAWAYS :-
Stored procedures are a type of database object in database management systems (DBMS) that
encapsulate a series of SQL statements and procedural logic. They provide several features and
advantages for database development and management. Here are some key features of stored
procedures:
Parameter Support :- Stored procedures can accept parameters, allowing them to receive input
values when executed. This makes procedures versatile and reusable with different input
values.
Encapsulation of Logic :- Stored procedures encapsulate procedural logic, allowing developers
to organize and centralize complex business logic within the database. This promotes code
reusability and maintainability.
Transaction Control :- Stored procedures can include explicit transaction control statements
(BEGIN TRANSACTION, COMMIT, ROLLBACK) to manage transactions. This is useful for ensuring
data consistency and integrity.

Error Handling :- Stored procedures can include error-handling mechanisms, such as
TRY...CATCH blocks, to manage exceptions and handle errors gracefully.
Dynamic SQL :- Stored procedures can generate and execute dynamic SQL statements, allowing
for more flexibility in constructing and executing queries based on runtime conditions.
Improved Performance :- Stored procedures can improve performance by precompiling and
optimizing SQL statements, reducing the overhead associated with repeated query compilation.
Access Control :- Stored procedures provide a mechanism for controlling access to data. Users
can be granted permissions to execute specific stored procedures while restricting direct access
to underlying tables.
Code Reusability :- Since stored procedures encapsulate logic, they can be reused across
different parts of an application or by different applications, promoting modular development.
Reduced Network Traffic :- By executing logic on the database server, stored procedures can
reduce the amount of data transferred between the database server and the client application,
resulting in improved performance.
Enhanced Security :- Stored procedures can enhance security by allowing the database
administrator to control access to data and operations. Users may only be granted permission
to execute specific stored procedures, limiting their interaction with the database.

STORED PROCEDURES
SUB LESSON 12.3

ADVANTAGES OF USING STORED PROCEDURE

A stored procedure is a group of one or more pre-compiled SQL statements into a logical unit.
It is stored as an object inside the database server. It is a subroutine or a subprogram in the
common computing language that has been created and stored in the database. Each
procedure in SQL Server always contains a name, parameter lists, and Transact-SQL
statements. The SQL Database Server stores the stored procedures as named objects. We can
invoke the procedures by using triggers, other procedures, and applications such
as Java, Python, PHP, etc. It can support almost all relational database systems.
SQL Server builds an execution plan when the stored procedure is called the first time and
stores them in the cache memory. The plan is reused by SQL Server in subsequent executions of
the stored procedure, allowing it to run quickly and efficiently.

FEATURES OF STORED PROCEDURES:-

The following are the features of stored procedure :-


o Reduced Traffic: A stored procedure reduces network traffic between the application and the
database server, resulting in increased performance. It is because instead of sending
several SQL statements, the application only needs to send the name of the stored procedure
and its parameters.
o Stronger Security: The procedure is always secure because it manages which processes and
activities we can perform. It removes the need for permissions to be granted at the database
object level and simplifies the security layers.
o Reusable: Stored procedures are reusable. It reduces code inconsistency, prevents unnecessary
rewrites of the same code, and makes the code transparent to all applications or users.
o Easy Maintenance: The procedures are easier to maintain without restarting or deploying the
application.
o Improved Performance: Stored Procedure increases the application performance. Once we
create the stored procedures and compile them the first time, it creates an execution plan

reused for subsequent executions. The procedure is usually processed quicker because the
query processor does not have to create a new plan.

KEY TAKEAWAYS:-

Stored procedures are a type of database object in database management systems (DBMS) that
encapsulate a series of SQL statements and procedural logic. They provide several features and
advantages for database development and management. Here are some key features of stored
procedures:

Parameter Support :- Stored procedures can accept parameters, allowing them to receive input
values when executed. This makes procedures versatile and reusable with different input
values.
Encapsulation of Logic :- Stored procedures encapsulate procedural logic, allowing developers
to organize and centralize complex business logic within the database. This promotes code
reusability and maintainability.
Transaction Control :- Stored procedures can include explicit transaction control statements
(BEGIN TRANSACTION, COMMIT, ROLLBACK) to manage transactions. This is useful for ensuring
data consistency and integrity.
Error Handling :- Stored procedures can include error-handling mechanisms, such as
TRY...CATCH blocks, to manage exceptions and handle errors gracefully.
Dynamic SQL :- Stored procedures can generate and execute dynamic SQL statements, allowing
for more flexibility in constructing and executing queries based on runtime conditions.
Improved Performance :- Stored procedures can improve performance by precompiling and
optimizing SQL statements, reducing the overhead associated with repeated query compilation.
Access Control :- Stored procedures provide a mechanism for controlling access to data. Users
can be granted permissions to execute specific stored procedures while restricting direct access
to underlying tables.

Code Reusability :- Since stored procedures encapsulate logic, they can be reused across
different parts of an application or by different applications, promoting modular development.

Reduced Network Traffic :- By executing logic on the database server, stored procedures can
reduce the amount of data transferred between the database server and the client application,
resulting in improved performance.
Enhanced Security :- Stored procedures can enhance security by allowing the database
administrator to control access to data and operations. Users may only be granted permission
to execute specific stored procedures, limiting their interaction with the database.
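Several of these features — parameters, encapsulated logic, transaction control and error handling — can be sketched end-to-end. SQLite has no stored procedures, so the sketch below is an application-side analogue written with Python's built-in sqlite3 module; the `employees`/`salaries` tables and the function name are illustrative assumptions, not part of the lesson.

```python
import sqlite3

def add_employee(conn, name, salary):
    """Application-side sketch of what a stored procedure encapsulates:
    parameters, centralized logic, and explicit transaction control."""
    try:
        conn.execute("BEGIN")                          # like BEGIN TRANSACTION
        cur = conn.execute("INSERT INTO employees(name) VALUES (?)", (name,))
        emp_id = cur.lastrowid
        # Related rows are created in the same unit of work.
        conn.execute("INSERT INTO salaries(emp_id, amount) VALUES (?, ?)",
                     (emp_id, salary))
        conn.commit()                                  # COMMIT on success
        return emp_id
    except sqlite3.Error:
        conn.rollback()                                # ROLLBACK on any error
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE salaries (emp_id INTEGER, amount INTEGER)")
add_employee(conn, "Asha", 50000)   # reusable with different parameter values
add_employee(conn, "Ravi", 60000)
print(conn.execute("SELECT COUNT(*) FROM salaries").fetchone()[0])
```

Because the logic lives in one place, every caller gets the same validation and transaction behaviour — the same motivation that applies to a real stored procedure on the server.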

Triggers



SUB LESSON 13.1

FUNDAMENTALS OF DATABASE TRIGGERS

WHAT IS A TRIGGER IN DBMS ?

⮚ Triggers are the SQL statements that are automatically executed when there is any change in
the database. The triggers are executed in response to certain events (INSERT, UPDATE or
DELETE) in a particular table. These triggers help in maintaining the integrity of the data by
changing the data of the database in a systematic fashion. For example, when a new record
(representing a new worker) is added to the employees table, new records should also be
created in the tables of the taxes, vacations and salaries. Triggers can also be used to log
historical data, for example to keep track of employees' previous salaries.
Schema-level (DDL) triggers fire on changes to database objects rather than to rows:
● After Creation
● Before Alter
● After Alter
● Before Drop
● After Drop
(Before/After Insert triggers, by contrast, are DML triggers, not schema-level triggers.)
The four main types of triggers are :-
1. Row-level trigger: This gets executed before or after any column value of a row changes.
2. Column-level trigger: This gets executed before or after the specified column changes.
3. For each row type: This trigger gets executed once for each row of the result set affected by an
insert/update/delete.
4. For each statement type: This trigger gets executed only once per statement, no matter how
many rows the statement affects.



Syntax
create trigger Trigger_name
(before | after)
[insert | update | delete]
on [table_name]
[for each row]
[trigger_body]
1. CREATE TRIGGER: These two keywords specify that a triggered block is going to be declared.
2. TRIGGER_NAME: The name of the trigger being created; it should be unique within the
schema.
3. BEFORE | AFTER: It specifies when the trigger will fire, i.e. before or after the triggering
event.
4. INSERT | UPDATE | DELETE: These are the DML operations, and we can use any of them in a
given trigger.
5. ON [TABLE_NAME]: It specifies the name of the table on which the trigger is going to be
applied.
6. FOR EACH ROW: Row-level trigger gets executed when any row value of any column changes.
7. TRIGGER BODY: It consists of queries that need to be executed when the trigger is called.
Example
Suppose we have a table named Student containing the attributes Student_id, Name, Address,
and Marks.

Now, we want to create a trigger that will add 100 marks to each new row of the Marks column
whenever a new student is inserted to the table.



The SQL Trigger will be:
CREATE TRIGGER Add_marks
BEFORE
INSERT
ON Student
FOR EACH ROW
SET new.Marks = new.Marks + 100;

The new keyword refers to the row that is getting affected.


After creating the trigger, we will write the query for inserting a new student in the database.
INSERT INTO Student(Name, Address, Marks) VALUES('Alizeh', 'Maldives', 110);

The Student_id column is an auto-increment field and will be generated automatically when a
new record is inserted into the table.

To see the final output the query would be:


SELECT * FROM Student;
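The trigger above can be tried end-to-end with Python's built-in sqlite3 module. The syntax shown in the lesson is MySQL's: SQLite cannot modify NEW.Marks inside a BEFORE trigger, so this sketch uses an equivalent AFTER INSERT trigger that updates the freshly inserted row — same effect, different dialect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Student (
    Student_id INTEGER PRIMARY KEY AUTOINCREMENT,
    Name TEXT, Address TEXT, Marks INTEGER)""")

# SQLite equivalent of the Add_marks trigger: add 100 marks to every
# newly inserted student row, automatically.
conn.execute("""
CREATE TRIGGER Add_marks AFTER INSERT ON Student
FOR EACH ROW
BEGIN
    UPDATE Student SET Marks = Marks + 100
    WHERE Student_id = NEW.Student_id;
END""")

conn.execute("INSERT INTO Student(Name, Address, Marks) "
             "VALUES('Alizeh', 'Maldives', 110)")
# The trigger fired without being called explicitly: marks become 210.
print(conn.execute("SELECT Name, Marks FROM Student").fetchall())
```

Note how the application code only performs the INSERT; the extra 100 marks appear because the database fired the trigger on its own.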

ADVANTAGES OF TRIGGERS :-

1. Triggers provide a way to check the integrity of the data. When the database changes, triggers
can adjust the affected data automatically and systematically.
2. Triggers help keep application code lightweight. Instead of repeating the same function call all
over the application, you can define a trigger once and it will be executed automatically.



DISADVANTAGES OF TRIGGERS :-

1. Triggers may be difficult to troubleshoot because they execute automatically in the database. If
there is an error, it is hard to trace the trigger logic, since triggers fire invisibly before or after
updates/inserts happen.
2. Triggers may increase the overhead of the database, as they are executed every time a
monitored field is updated.

KEY TAKEAWAYS:-
⮚ In the context of a database management system (DBMS), a trigger is a set of instructions or a
program that is automatically executed ("triggered") in response to a specific event or condition
occurring in the database. These events or conditions could include actions like inserting,
updating, or deleting records in a table.

⮚ Triggers are often used to enforce business rules, data integrity, and maintain consistency
within a database.

⮚ Triggers are defined using a specific syntax provided by the database management system. The
trigger code is associated with a specific table and event (e.g., before or after an insert, update,
or delete operation). When the specified event occurs, the trigger is automatically invoked, and
its associated code is executed.

⮚ It's important to use triggers judiciously, as they can introduce complexity and potential
performance overhead. Overreliance on triggers may also make it challenging to understand
and maintain the database logic.



TRANSACTIONAL CONTROL

SUB LESSON 14.1

TRANSACTIONAL CONTROL

TRANSACTION PROPERTY :-

A transaction has four properties. These are used to maintain consistency in a database,
before and after the transaction.

PROPERTY OF TRANSACTION :-

1. Atomicity
2. Consistency
3. Isolation
4. Durability

ATOMICITY :-

➢ It states that either all operations of the transaction take place, or none do; if the
transaction cannot complete, it is aborted.
➢ There is no midway, i.e., the transaction cannot occur partially. Each transaction is
treated as one unit and either runs to completion or is not executed at all.
Atomicity involves the following two operations:
Abort: If a transaction aborts then all the changes made are not visible.
Commit: If a transaction commits then all the changes made are visible.
Example: Consider a transaction T consisting of two operations, T1 and T2. Account A holds Rs 600
and account B holds Rs 300, and T transfers Rs 100 from account A to account B.
After completion of the transaction, A consists of Rs 500 and B consists of Rs 400.
If the transaction T fails after the completion of transaction T1 but before completion of
transaction T2, then the amount will be deducted from A but not added to B. This shows the
inconsistent database state. In order to ensure correctness of database state, the transaction
must be executed in entirety.
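The A-to-B transfer can be simulated with Python's sqlite3 module to show atomicity in action: a failure injected between T1 and T2 triggers a rollback, so the partial debit of A disappears. This is a minimal sketch; the `account` table is an illustrative assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 600), ("B", 300)])

try:
    conn.execute("BEGIN")
    conn.execute("UPDATE account SET balance = balance - 100 WHERE name = 'A'")  # T1
    raise RuntimeError("failure between T1 and T2")                              # simulated crash
    conn.execute("UPDATE account SET balance = balance + 100 WHERE name = 'B'")  # T2 never runs
    conn.commit()
except RuntimeError:
    conn.rollback()  # atomicity: the partial debit of A is undone

print(conn.execute("SELECT name, balance FROM account ORDER BY name").fetchall())
# A is back to 600 and B is still 300 -- all or nothing.
```

Without the rollback, the database would be left in the inconsistent state described above (A debited, B not credited).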

CONSISTENCY :-

➢ The integrity constraints are maintained so that the database is consistent before and
after the transaction.
➢ The execution of a transaction will leave a database in either its prior stable state or a
new stable state.
➢ The consistent property of database states that every transaction sees a consistent
database instance.
➢ The transaction is used to transform the database from one consistent state to another
consistent state.
For example: The total amount must be maintained before or after the transaction.
1. Total before T occurs = 600+300=900
2. Total after T occurs= 500+400=900
Therefore, the database is consistent. In the case when T1 is completed but T2 fails, then
inconsistency will occur.

ISOLATION :-

➢ It shows that the data which is used at the time of execution of a transaction cannot be
used by the second transaction until the first one is completed.
➢ In isolation, if the transaction T1 is being executed and using the data item X, then that
data item can't be accessed by any other transaction T2 until the transaction T1 ends.
➢ The concurrency control subsystem of the DBMS enforces the isolation property.

DURABILITY :-

➢ The durability property guarantees that once a transaction commits, its changes are
permanent.
➢ They cannot be lost by the erroneous operation of a faulty transaction or by the system
failure. When a transaction is completed, then the database reaches a state known as the
consistent state. That consistent state cannot be lost, even in the event of a system's failure.
➢ The recovery subsystem of the DBMS is responsible for the durability property.

STATES OF TRANSACTION :-

In a database, the transaction can be in one of the following states :-

ACTIVE STATE :-

➢ The active state is the first state of every transaction. In this state, the transaction is
being executed.
➢ For example: insertion, deletion or updating of a record happens here, but the changes
are not yet saved to the database.

PARTIALLY COMMITTED :-

➢ In the partially committed state, a transaction executes its final operation, but the data
is still not saved to the database.
➢ In the total mark calculation example, a final display of the total marks step is executed
in this state.

COMMITTED :-

➢ A transaction is said to be in a committed state if it executes all its operations successfully. In
this state, all its effects are now permanently saved in the database system.

FAILED STATE :-

➢ If any of the checks made by the database recovery system fails, then the transaction is
said to be in the failed state.
➢ In the example of total mark calculation, if the database is not able to fire a query to
fetch the marks, then the transaction will fail to execute.

ABORTED :-

➢ If any of the checks fail and the transaction has reached a failed state then the database
recovery system will make sure that the database is in its previous consistent state. If not then
it will abort or roll back the transaction to bring the database into a consistent state.
➢ If the transaction fails in the middle of execution, all the operations it has already executed
are rolled back to restore the consistent state.
➢ After aborting the transaction, the database recovery module will select one of the two
operations:
o Re-start the transaction
o Kill the transaction

KEY TAKEAWAYS:-
➢ Transactional control in a database management system (DBMS) refers to the mechanisms and
commands that allow you to manage the transactions that occur within the database. A
transaction is a sequence of one or more database operations that are executed as a single unit
of work. These operations can include reading from or writing to the database.
➢ Transactional control ensures that database transactions are executed in a way that maintains
the consistency, isolation, durability, and atomicity properties of the database.


SUB LESSON 14.2

TCL COMMANDS

➢ TCL stands for Transaction Control Languages. These commands are used for maintaining
consistency of the database and for the management of transactions made by the DML
commands.
➢ A Transaction is a set of SQL statements that are executed on the data stored in the DBMS.
Whenever a transaction is performed, its changes are initially temporary in the database. To
make the changes permanent, we use TCL commands.

APPLICATIONS OF TCL :-

➢ Committing Transactions: TCL statements can be used to commit a transaction, which means to
permanently save the changes made during the transaction to the database.
➢ Rolling Back Transactions: TCL statements can be used to roll back a transaction, which means
to undo the changes made during the transaction and restore the database to its previous
state.
➢ Setting Transaction Isolation Levels: TCL statements can be used to set the transaction isolation
level, which determines the level of concurrency and consistency in the database.
➢ Savepoints: TCL statements can be used to set savepoints within a transaction, allowing for
partial rollback if needed.
➢ Managing Transactions in Stored Procedures: TCL statements can be used in stored procedures
to manage transactions within the scope of the procedure.
➢ Overall, TCL is an essential part of SQL and is used extensively in database management
systems to control transactions and ensure data consistency. By using TCL statements, database
administrators and developers can manage transactions effectively and maintain the integrity
of their databases.
The TCL commands are:
➢ COMMIT
➢ ROLLBACK

➢ SAVEPOINT

1. COMMIT :-

This command is used to save the data permanently.


Whenever we perform any DML command like INSERT, DELETE or UPDATE, the changes can be
rolled back if they have not been stored permanently. So, to be on the safer side, the COMMIT
command is used.
Syntax:
commit;

2. ROLLBACK :-

This command is used to restore the data to the last savepoint or the last committed state. If
for some reason the data inserted, deleted or updated is not correct, you can roll back the
data to a particular savepoint or, if no savepoint has been set, to the last committed state.
Syntax:
rollback;

3. SAVEPOINT :-

This command is used to save the data at a particular point temporarily, so that the
transaction can be rolled back to that point whenever needed.
Syntax:
Savepoint A;
Consider the following Table Student:

Name Marks
John 79
Jolly 65
Shuzan 70
UPDATE STUDENT
SET NAME = ‘Sherlock’

WHERE NAME = ‘Jolly’;

COMMIT;

COMMIT AND ROLLBACK EXAMPLE :-

By using the COMMIT command you can update the record and save the change permanently.
Now after COMMIT:
Name Marks
John 79
Sherlock 65
Shuzan 70
If COMMIT was not performed, then the changes made by the UPDATE command can be rolled back.
Now suppose no COMMIT is performed.
UPDATE STUDENT
SET NAME = ‘Sherlock’
WHERE NAME = 'Jolly';
After update command the table will be:
Name Marks
John 79
Sherlock 65
Shuzan 70
Now if ROLLBACK is performed on the above table:
rollback;
After Rollback:
Name Marks
John 79
Jolly 65

Shuzan 70
If on the above table savepoint is performed:
INSERT into STUDENT
VALUES ('Jack', 95);
Commit;
UPDATE STUDENT
SET NAME = 'Rossie'
WHERE MARKS = 70;
SAVEPOINT A;
INSERT INTO STUDENT
VALUES ('Zack', 76);
SAVEPOINT B;
INSERT INTO STUDENT
VALUES ('Bruno', 85);
SAVEPOINT C;
SELECT *
FROM STUDENT;
Name Marks
John 79
Jolly 65
Rossie 70
Jack 95
Zack 76
Bruno 85
Now if we Rollback to Savepoint B:
Rollback to B;
The resulting Table will be-
Name Marks
John 79

Jolly 65
Rossie 70
Jack 95
Zack 76
Now if we Rollback to Savepoint A:
Rollback to A;
The resulting Table will be-
Name Marks
John 79
Jolly 65
Rossie 70
Jack 95
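The savepoint sequence above can be reproduced with Python's sqlite3 module, which supports SAVEPOINT and ROLLBACK TO. A minimal sketch of the same Student example, rolling back first to B and then to A:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Student (Name TEXT, Marks INTEGER)")
conn.execute("BEGIN")
conn.executemany("INSERT INTO Student VALUES (?, ?)",
                 [("John", 79), ("Jolly", 65), ("Rossie", 70), ("Jack", 95)])
conn.execute("SAVEPOINT A")
conn.execute("INSERT INTO Student VALUES ('Zack', 76)")
conn.execute("SAVEPOINT B")
conn.execute("INSERT INTO Student VALUES ('Bruno', 85)")

conn.execute("ROLLBACK TO B")   # undoes Bruno only
conn.execute("ROLLBACK TO A")   # undoes Zack as well
conn.commit()                   # make the surviving rows permanent

print([row[0] for row in conn.execute("SELECT Name FROM Student")])
# ['John', 'Jolly', 'Rossie', 'Jack']
```

Rolling back to a savepoint discards only the work done after that savepoint; everything before it stays in the transaction and is made permanent by the final COMMIT.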

KEY TAKEAWAYS :-
➢ In the context of a Database Management System (DBMS), TCL stands for Transaction Control
Language. TCL commands are used to manage transactions in a database, controlling the
beginning and ending of transactions and ensuring the consistency and integrity of the data.
The two primary TCL commands are COMMIT and ROLLBACK.

TYPES OF LOCKS

SUB LESSON 15.1

ROW LEVEL LOCKS & TABLE LEVEL LOCKS

LEVELS OF LOCKING IN DBMS :-

The locking in a database can be done at 4 levels, which start with the database at the highest
level and down via table and page to the row at the lowest level.
● Database Level
● Table Level
● Page-Level
● Row Level
Before we discuss the levels of locking, we should know about the types of locks, or lock modes.
There are six lock types, discussed below:
● Exclusive (X) Lock :-
This method of locking differentiates the locks based on their usage. It ensures that the data of a
page is reserved exclusively for the transaction that holds the exclusive lock. These locks are
applied only to resources on which a WRITE operation is performed, and can be applied to a page
only if no other shared or exclusive lock is already applied.
● Shared (S) Lock :-
This method of locking is applied only to read operations. If this lock is applied to a row or
page, it reserves that row or page for reading. More than one shared lock can be applied to the
same row or page, provided no other type of lock is already applied.
● Intent exclusive (IX) Lock :-
This method of locking indicates intended explicit locking at a lower level with exclusive locks.
If a transaction holds this lock, it intends to modify lower-level resources by imposing an
exclusive lock on each of them separately.

TYPES OF LOCKS
● Intent shared (IS) Lock :-
This method of locking indicates intended explicit locking at a lower level of the tree, but only
with shared locks. If a transaction holds this lock, it intends to read lower-level resources by
imposing a shared lock on each of them separately.

● Shared intent exclusive (SIX) Lock :-


This method of locking states that the transaction reads the resources at a lower level: the
sub-tree rooted at the node is locked explicitly in shared mode, while explicit locking at a lower
level is done with exclusive-mode locks. Only one SIX lock can be acquired on a relation at a
time, and it blocks any other transactions that want to make updates.

● Update (U) Lock :-


This lock is taken on a record that is being read with the intent to modify it, typically a record
that already holds a shared lock. An update lock is compatible with existing shared locks on the
target row or page, but only one transaction can hold the update lock at a time. Once the
transaction holding the update lock is ready to modify the data, the update lock is converted
into an exclusive lock.

Drawbacks of locking :-
1. May not be free from recoverability.
2. Not free from deadlock.
3. Not free from starvation.
4. Not free from cascading rollback.

Row Level :-
This level of locking is the least restrictive compared to the other levels. At the row level, the
DBMS allows concurrent transactions to access different rows of the same relation, even when
the rows are located on the same page. At this level, three types of locks can be applied —
Exclusive, Shared and Update, which have already been discussed above — and the individual
rows touched by a query are locked.

Drawbacks of row-level locking:


● It is costly compared to coarser levels of locking, because the DBMS must acquire and manage
many more locks.
● Lock-management overhead grows with the number of rows accessed.

Examples:

First session:
mysql> START TRANSACTION;
mysql> select capital, code2 from country where code2 = 'ZW' for update;
+---------+-------+
| capital | code2 |
+---------+-------+
|    4068 | ZW    |
+---------+-------+
1 row in set (0.00 sec)

Second Session:
mysql> update country set capital=5000 where code2 = 'ZW';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting the transaction

Table Level:
At the table level, the full table or relation is locked. If two transactions T1 and T2 access the
same relation and T1 holds a table lock on it, then T2 cannot use it. However, two transactions
can access the same database concurrently if they access different relations. A transaction
locking at the table level will hold shared and/or exclusive table locks. At the table level, there
are five different types of locks.

I.e., Exclusive (X), Shared (S), Intent exclusive (IX), Intent shared (IS), and shared with intent
exclusive (SIX) and these locks have already been discussed above.

Applications of Table level –


● This locking level is generally unsuitable for highly concurrent multi-user database
management systems, since it limits concurrency.
● It is primarily used to prevent a relation from being dropped while a DML operation on it is in
progress.

KEY TAKEAWAYS:-

Locks in Database Management Systems (DBMS) are mechanisms used to control access to
shared resources, such as database tables or rows, to ensure the consistency and integrity of
data in a multi-user environment. Locks prevent conflicts that can arise when multiple
transactions try to access or modify the same data concurrently. There are various types of
locks, and their usage depends on the level of isolation required by the system.

The management of locks is a critical aspect of database systems to ensure data consistency
and avoid conflicts between concurrent transactions. However, improper use or excessive
locking can lead to performance issues, such as reduced concurrency. Database systems often
provide various isolation levels and locking strategies to balance consistency and performance,
and the specific implementation details may vary between different database systems.

SUB LESSON 15.2

EXCLUSIVE LOCK AND SHARED LOCK

EXCLUSIVE LOCK (X) :-

This method of locking differentiates the locks based on their usage. It ensures that the data of a
page is reserved exclusively for the transaction that holds the exclusive lock. These locks are
applied only to resources on which a WRITE operation is performed, and can be applied to a page
only if no other shared or exclusive lock is already applied.
● When a statement modifies data, its transaction holds an exclusive lock on data that prevents
other transactions from accessing the data.
● This lock remains in place until the transaction holding the lock issues a commit or rollback.
● They can be owned by only one transaction at a time.
● With an exclusive lock, a data item can be read as well as written. It is also called a write lock.
● Any transaction that requires an exclusive lock must wait if another transaction currently holds
an exclusive lock or a shared lock on the requested resource.
● It is denoted as Lock-X.
● X-lock is requested using Lock-X instruction.
For example, consider a case where initially A=100 when a transaction needs to deduct 50 from
A. We can allow this transaction by placing an X lock on it. Therefore, when any other
transaction wants to read or write, an exclusive lock prevents it.
Lock Compatibility Matrix:

              Shared (S)   Exclusive (X)
Shared (S)      TRUE          FALSE
Exclusive (X)   FALSE         FALSE
● If transaction T1 is holding a shared lock in data item A, then the control manager can grant the
shared lock to transaction T2 as compatibility is TRUE, but it cannot grant the exclusive lock as
compatibility is FALSE.
● In simple words if transaction T1 is reading a data item A, then the same data item A can be
read by another transaction T2 but cannot be written by another transaction.
● Similarly, if an exclusive lock (i.e., a lock for read and write operations) is held on the data item
by some transaction, then no other transaction can acquire a shared or exclusive lock on it, as
the compatibility function denotes FALSE.

SHARED LOCK (S) :-

If any application acquires a SHARED LOCK on a page, then it can read that page but cannot
update it. Other concurrent applications can acquire the SHARED or UPDATE lock on the same
page.
● Another transaction that tries to read the same data is permitted to read, but a transaction that
tries to update the data will be prevented from doing so until the shared lock is released.
● Shared lock is also called read lock, used for reading data items only.
● Shared locks support read integrity. They ensure that a record is not in process of being
updated during a read-only request.
● Shared locks can also be used to prevent any kind of updates of record.
● It is denoted by Lock-S.
● S-lock is requested using Lock-S instruction.
For example, consider a case where initially A=100 and there are two transactions which are
reading A. If one of the transactions wants to update A, in that case, the other transaction
would be reading the wrong value. However, the Shared lock prevents it from updating until it
has finished reading.
A shared lock is applied during the execution of a select query. A shared lock is similar to an
update lock, but the difference is the locking duration: the shared lock has scope only for the
select query.

● When a record has a shared lock, other transactions cannot update or delete the record; they
can only select it.
● An update lock is applied only together with a shared lock; if there is no shared lock, then an
update lock cannot be applied.

DIFFERENCE BETWEEN SHARED LOCK AND EXCLUSIVE LOCK :-

1. Lock mode: A shared lock is a read-only lock; an exclusive lock permits read as well as write
operations.
2. Placement: A shared lock can be placed on objects that do not already have an exclusive lock
placed on them; an exclusive lock can only be placed on objects that do not have any other kind
of lock.
3. Effect: A shared lock prevents others from updating the data; an exclusive lock prevents others
from reading or updating the data.
4. When issued: A shared lock is issued when a transaction wants to read items that do not have
an exclusive lock; an exclusive lock is issued when a transaction wants to update an unlocked
item.
5. Concurrency: Any number of transactions can hold a shared lock on an item; an exclusive lock
can be held by only one transaction at a time.
6. Instruction: An S-lock is requested using the lock-S instruction; an X-lock is requested using the
lock-X instruction.

KEY TAKEAWAYS :-

The main difference between a shared lock and an exclusive lock lies in the level of access they
provide to the locked resource and the operations they permit for concurrent transactions.

Shared Lock :-

Purpose: Allows multiple transactions to read a resource simultaneously without permitting any
of them to modify it.
Read Access: Permitted. Multiple transactions holding shared locks can read the resource
concurrently.
Write Access: Not permitted. Transactions holding shared locks cannot modify the resource.
Concurrency: Supports high concurrency for read operations. Multiple transactions can acquire
shared locks on the same resource simultaneously.

Exclusive Lock :-

Purpose: Grants exclusive access to a resource, preventing other transactions from both
reading and writing it.
Read Access: Only the transaction holding the exclusive lock can read the resource; other
transactions cannot.
Write Access: Permitted for the lock holder. The transaction holding the exclusive lock has the
exclusive right to modify the resource.
Concurrency: Limits concurrency, as only one transaction at a time can hold an exclusive lock on
a given resource. This ensures that modifications are serialized and conflicts are avoided.
In summary, a shared lock is used when multiple transactions need to read a resource
concurrently but must not modify it, allowing high concurrency for reads; an exclusive lock is
used when a transaction must modify the resource, serializing access to it.

SUB LESSON 15.3

DEADLOCK

DEAD LOCK(S) :-

➢ Deadlock is a situation where two transactions block each other's progress. It is a
problematic situation.
➢ In this lesson, let's look at what a deadlock is and how we can avoid it. Consider the following
situation: Transaction 1 holds Resource 1 and Transaction 2 holds Resource 2. Transaction 1 then
requests Resource 2 and Transaction 2 requests Resource 1; since neither releases what it holds,
a deadlock occurs.

Steps to identify the deadlock as follows -

• In the following example, two transactions try to update the same records, 1 and 2, but in
opposite orders of execution.
• Tran1 takes an exclusive lock on record 1 and then requests record 2, while Tran2 takes an
exclusive lock on record 2 and then requests record 1; each waits on a lock held by the other.
• When they are executed, an error message is shown because a deadlock situation has
occurred.
• SQL Server detects the deadlock, chooses a victim, kills Tran2 and lets Tran1 execute.

COFFMAN CONDITIONS:-

Regarding deadlock in DBMS, there were four conditions stated by Coffman. A deadlock might
occur if all of these four Coffman conditions hold true at any point in time.
Mutual Exclusion: There should be at least one resource that cannot be utilized by more than
one transaction at a time.
Hold and wait condition: A transaction holding a resource requests additional resources that are
already being held by other transactions in the system.
No pre-emption condition: Access to a particular resource can never be forcibly taken from a
running transaction. Only the running transaction itself can release a resource that it is
holding.
Circular wait condition: In this condition, a transaction waits for a resource that is held by a
second transaction, which in turn waits for a third, and so on, with the last transaction waiting
for the very first one — giving rise to a circular chain of waiting transactions.
There are three ways to handle deadlock:

DEADLOCK PREVENTION OR AVOIDANCE :-

Prevention :-
The idea is to not let the system into a deadlock state. This system will make sure that above
mentioned four conditions will not arise. These techniques are very costly so we use this in
cases where our priority is making a system deadlock-free.

One can zoom into each category individually. Prevention is done by negating one of the above-
mentioned necessary conditions for deadlock, and can be done in four different ways:
1. Eliminate mutual exclusion
2. Solve hold and wait
3. Allow pre-emption
4. Break circular wait
Avoidance :-
Using the strategy of avoidance, we make an assumption: all information about the resources
that a process will need must be known before the process executes. We use the Banker's
algorithm (which is in turn a gift from Dijkstra) to avoid deadlock.
In prevention and avoidance, we get correctness of data, but performance decreases.

DEADLOCK DETECTION AND RECOVERY :-

If Deadlock prevention or avoidance is not applied to the software then we can handle this by
deadlock detection and recovery. Which consist of two phases.
1. In the first phase, we examine the state of the process and check whether there is a
deadlock or not in the system.
2. If found deadlock in the first phase, then we apply the algorithm for recovery of the
deadlock. In Deadlock detection and recovery, we get the correctness of data but performance
decreases.
3. Deadlock ignorance: If deadlocks are very rare, let them happen and reboot the system.
This is the approach that both Windows and UNIX take; it is known as the ostrich algorithm.
With deadlock ignorance, performance is better than with the above two methods, but the
correctness of data is not guaranteed.
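The detection phase described above is commonly implemented by searching the wait-for graph for a cycle: a cycle of waiting transactions is exactly the circular-wait condition. A minimal sketch, where the dictionary-of-sets graph representation is an assumption for illustration:

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph. wait_for maps each transaction
    to the set of transactions it is waiting on; a cycle means deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited / on current path / done
    color = {t: WHITE for t in wait_for}

    def visit(t):
        color[t] = GRAY
        for u in wait_for.get(t, ()):
            if color.get(u, WHITE) == GRAY:            # back edge -> cycle
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in wait_for)

# T1 waits on T2 and T2 waits on T1: circular wait, hence deadlock.
print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))   # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))    # False
```

If a cycle is found, the recovery phase picks one transaction in the cycle as the victim and rolls it back, breaking the circular wait.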

KEY TAKEAWAYS:-

➢ Deadlock is an undesired state that brings the whole system to a halt as no task ever gets
finished and is in a waiting state forever. If any of the transactions can lead to a deadlock then
that transaction is never executed.
➢ There are 4 Coffman conditions out of which if one or more are true, then there might occur a
deadlock in the system.
➢ Deadlock handling and its avoidance are methods to deal with the situation, while the Wait-die
and Wait-wound schemes are the two prominent ways of preventing a deadlock.
➢ Since Distributed systems are very vast and highly scalable, it is impossible to prevent or avoid a
deadlock. Therefore, three approaches that include Centralized, Distributed, and Hierarchical
are discussed for detecting deadlocks in such systems.

CONCURRENCY CONTROL

SUB LESSON 16.1

METHODS FOR CONCURRENCY CONTROL

DBMS CONCURRENCY CONTROL :-

➢ Concurrency Control is the management procedure that is required for controlling concurrent
execution of the operations that take place on a database.
➢ But before knowing about concurrency control, we should know about concurrent execution.
➢ Concurrent Execution in DBMS
➢ In a multi-user system, multiple users can access and use the same database at one time, which
is known as the concurrent execution of the database. It means that the same database is
executed simultaneously on a multi-user system by different users.
➢ While working on the database transactions, there occurs the requirement of using the
database by multiple users for performing different operations, and in that case, concurrent
execution of the database is performed.
➢ The thing is that the simultaneous execution that is performed should be done in an interleaved
manner, and no operation should affect the other executing operations, thus maintaining the
consistency of the database. Thus, on making the concurrent execution of the transaction
operations, there occur several challenging problems that need to be solved.

Concurrency control is provided in a database to:

(i) Enforce isolation among transactions.
(ii) Preserve database consistency through consistency-preserving execution of transactions.
(iii) Resolve read-write and write-read conflicts.

VARIOUS CONCURRENCY CONTROL TECHNIQUES ARE :-

1. Two-phase locking Protocol
2. Time stamp ordering Protocol
3. Multi version concurrency control
4. Validation concurrency control

TWO-PHASE LOCKING PROTOCOL :-

Locking is an operation which secures permission to read or permission to write a data item.
Two-phase locking is a process used to gain ownership of shared resources without creating the
possibility of deadlock. The three activities taking place in the two-phase update algorithm are:

(i) Lock Acquisition
(ii) Modification of Data
(iii) Release Lock

In its conservative (static) variant, two-phase locking prevents deadlock from occurring in
distributed systems by releasing all the resources it has acquired if it is not possible to acquire
all the resources required without waiting for another process to finish using a lock. This means
that no process is ever in a state where it is holding some shared resources while waiting for
another process to release a shared resource which it requires, so deadlock cannot occur due to
resource contention.
A transaction in the Two-Phase Locking Protocol can assume one of the 2 phases:

● (i) Growing Phase: In this phase a transaction can only acquire locks but cannot release any
lock. The point when a transaction acquires all the locks it needs is called the Lock Point.
● (ii) Shrinking Phase: In this phase a transaction can only release locks but cannot acquire any.
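As a rough sketch of the two phases (the class and method names are my own invention, not a DBMS API), a transaction can enforce the protocol by refusing new lock requests once any lock has been released:

```python
# Minimal two-phase locking discipline for a single transaction.
# Growing phase: locks may be acquired. Shrinking phase: only releases allowed.

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False   # becomes True after the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire '{item}' in shrinking phase")
        self.locks.add(item)

    def unlock(self, item):
        self.locks.discard(item)
        self.shrinking = True    # the lock point has passed

t = TwoPhaseTransaction("T1")
t.lock("X"); t.lock("Y")   # growing phase
t.unlock("X")              # shrinking phase begins
try:
    t.lock("Z")            # violates 2PL
except RuntimeError as e:
    print(e)
```

The moment `unlock` first runs corresponds to the lock point described above; any later `lock` call is a protocol violation.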

TIME STAMP ORDERING PROTOCOL :-

A timestamp is a tag that can be attached to any transaction or any data item, which denotes a
specific time on which the transaction or the data item had been used in any way. A timestamp
can be implemented in 2 ways. One is to directly assign the current value of the clock to the
transaction or data item. The other is to attach the value of a logical counter that keeps
incrementing as new timestamps are required. The timestamp of a data item can be of 2 types:
● (i) W-timestamp(X): This means the latest time when the data item X has been written into.
● (ii) R-timestamp(X): This means the latest time when the data item X has been read from.
These 2 timestamps are updated each time a successful read/write operation is performed on
the data item X.

MULTI VERSION CONCURRENCY CONTROL :-

Multiversion schemes keep old versions of data items to increase concurrency. In multiversion
two-phase locking, each successful write results in the creation of a new version of the data
item written. Timestamps are used to label the versions. When a read(X) operation is issued, an
appropriate version of X is selected based on the timestamp of the transaction.
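This version-selection rule can be sketched as follows; the representation of versions as (write-timestamp, value) pairs is an assumption for illustration:

```python
# Multiversion read: each write creates a (timestamp, value) version.
# A reader with timestamp ts sees the latest version written at or before ts.

def mv_read(versions, ts):
    """versions: list of (write_ts, value) pairs sorted by write_ts ascending."""
    chosen = None
    for w_ts, value in versions:
        if w_ts <= ts:
            chosen = value       # newest version not in the reader's future
        else:
            break
    return chosen

history = [(1, "v1"), (5, "v5"), (9, "v9")]   # versions of item X
print(mv_read(history, 7))   # a transaction with TS=7 reads "v5"
print(mv_read(history, 9))   # a transaction with TS=9 reads "v9"
```

Because readers never see versions written "in their future", reads never block writers, which is the concurrency gain multiversion schemes aim for.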

VALIDATION CONCURRENCY CONTROL :-

The optimistic approach is based on the assumption that the majority of the database
operations do not conflict. The optimistic approach requires neither locking nor time stamping
techniques. Instead, a transaction is executed without restrictions until it is committed. Using
an optimistic approach, each transaction moves through 2 or 3 phases, referred to as read,
validation and write.
(i) During read phase, the transaction reads the database, executes the needed computations
and makes the updates to a private copy of the database values. All update operations of the
transactions are recorded in a temporary update file, which is not accessed by the remaining
transactions.
(ii) During the validation phase, the transaction is validated to ensure that the changes made
will not affect the integrity and consistency of the database. If the validation test is positive, the
transaction goes to the write phase. If the validation test is negative, the transaction is restarted
and the changes are discarded.
(iii) During the write phase, the changes are permanently applied to the database.

KEY TAKEAWAYS :-
Concurrency control in a Database Management System (DBMS) is the process of managing
simultaneous access to the database by multiple transactions to ensure data consistency,
isolation, and integrity. In a multi-user environment, several transactions may attempt to access
and modify the same data concurrently, which can lead to various issues, such as lost updates,
uncommitted data, and inconsistency. Concurrency control mechanisms help prevent these
problems and maintain the integrity of the database.


CONCURRENCY CONTROL
SUB LESSON 16.2

LOCKING METHODS

LOCKING TECHNIQUES IN DBMS :-

In the previous lesson, we saw that transactions should form a serializable schedule. But testing
for serializability is computationally expensive and impractical. Hence, in order to ensure the
correct execution of concurrent transactions, a number of concurrency control techniques are
applied in DBMS. One of them is the use of locking techniques.
Concurrency control schemes are the schemes used by a DBMS to ensure that concurrent
transactions do not interfere with each other. Some of these schemes are:
● Locking
● Time stamps, and
● Optimistic protocols.
Here we will discuss locking.

INTRODUCTION TO LOCKING :-

Locking is one of the most commonly used concurrency control schemes in DBMS. This works
by associating a variable lock on the data items. This variable describes the status of the data
item with respect to the possible operations that can be applied on it. The value of this variable
is used to control the concurrent access and the manipulation of the associated data item.
The concurrency control technique in which the value of the lock variable is manipulated is
called locking. The technique of locking is one way to ensure Serializability in DBMS.
In DBMS, locking is the responsibility of a subsystem called lock manager.
Types of Locking Techniques

To control concurrency there are various types of locks which can be applied in DBMS.

Locks are usually issued by the transactions before any operations are performed by them.

BINARY LOCKS :-

A binary lock has two states or values associated with each data item. These values are:
1. Locked – 1
2. Unlocked – 0
If a data item is locked, then it cannot be accessed by other transactions i.e., other transactions
are forced to wait until the lock is released by the previous transaction.
But, if a data item is in the unlocked state, then, it can be accessed by any transaction and on
access the lock value is set to locked state.
These locks are applied and removed using the Lock() and Unlock() operations, respectively.
In binary locks, at a particular point in time, only one transaction can hold a lock on the data
item. No other transaction will be able to access the same data concurrently. Hence, Binary
locks are very simple to apply but are not used practically.
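A minimal sketch of binary locking (the class name and the dictionary representation of lock bits are illustrative, not from the text):

```python
# Binary lock: one bit per data item, 0 = unlocked, 1 = locked.

class BinaryLockManager:
    def __init__(self):
        self.lock_bit = {}   # item -> 0 or 1

    def lock(self, item):
        if self.lock_bit.get(item, 0) == 1:
            return False     # already locked: the caller must wait
        self.lock_bit[item] = 1
        return True

    def unlock(self, item):
        self.lock_bit[item] = 0

mgr = BinaryLockManager()
print(mgr.lock("X"))   # True  -> T1 now holds X
print(mgr.lock("X"))   # False -> T2 must wait
mgr.unlock("X")        # T1 releases X
print(mgr.lock("X"))   # True  -> T2 can now lock X
```

Note that even two readers exclude each other here, which is why the text calls binary locks simple but impractical.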
Shared / Exclusive Locks
In shared locks, multiple users are allowed to access the same data item with a read lock which
is shared by them. But, in case when a transaction needs to write a data item, then
an exclusive lock is applied on that data item. So here, we classify the locks as:
● Shared Locks
● Exclusive Locks

SHARED LOCKS:-

Shared locks are applied to a data item when the transaction requests a read operation on the
data item. A shared lock will allow multiple transactions to only read the data
item concurrently.
As these locks are applied on read operation, they will not compromise on the consistency of
the database.
Exclusive Locks
Exclusive locks on the other hand are applied on the transactions which request
a write operation on the data item.
The transaction which is modifying the data item requests an exclusive lock on the data item
and hence any other transaction which needs access to the data item has to wait until the lock
applied by the previous transaction has been released by it.
But when exclusive locks are applied there are situations when a transaction enters into a wait
state indefinitely. Such a state where a transaction cannot come out of the wait state is known
as a deadlock.
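The shared/exclusive compatibility rules (many concurrent readers, at most one writer) can be sketched as below. This hypothetical lock manager only grants or refuses requests; waiting queues and lock release are omitted for brevity:

```python
# Shared/exclusive lock compatibility: S is compatible with S,
# X is compatible with nothing.

class SXLockManager:
    def __init__(self):
        self.holders = {}   # item -> {"mode": "S" or "X", "txns": set of holders}

    def acquire(self, txn, item, mode):
        entry = self.holders.get(item)
        if entry is None:
            self.holders[item] = {"mode": mode, "txns": {txn}}
            return True
        if mode == "S" and entry["mode"] == "S":
            entry["txns"].add(txn)          # readers share the lock
            return True
        return False                        # any conflict with X: must wait

mgr = SXLockManager()
print(mgr.acquire("T1", "X", "S"))  # True  (first reader)
print(mgr.acquire("T2", "X", "S"))  # True  (shared with T1)
print(mgr.acquire("T3", "X", "X"))  # False (writer must wait for readers)
```

A real lock manager would enqueue the refused writer instead of returning False; it is exactly that waiting which can lead to the deadlocks discussed above.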

TWO PHASE LOCKING :-

The Two Phase Locking Techniques guarantee Serializability in DBMS. A transaction is said to
follow Two Phase Locking Protocol if all locking operations in the transaction precede the first
unlock operation.
In this, locks are applied in two phases:
● Growing Phase
● Shrinking Phase
Growing Phase
This phase is also known as the first phase or the expanding phase. It is in this phase that the
transaction acquires all the locks needed by it but it cannot release any locks here.
Shrinking Phase
This phase is also known as the second phase or the contracting phase. Here a transaction is not
allowed to acquire any new locks, but it can release the existing locks it holds. The Two Phase
Locking Protocol helps solve the problems of lost update, inconsistent analysis, and dirty read
too.

KEY TAKEAWAYS:-

Locks in a Database Management System (DBMS) offer several advantages when it comes to
managing concurrent access to shared resources. Here are some of the key advantages of using
locks:
Data Consistency :- Locks help maintain data consistency by preventing multiple transactions
from accessing or modifying the same resource simultaneously. This ensures that conflicting
operations are serialized, avoiding potential inconsistencies in the database.
Isolation of Transactions :- Locks provide a mechanism to isolate transactions from each other.
By preventing concurrent access to certain resources, locks help ensure that the changes made
by one transaction are not visible to other transactions until the first transaction is committed.
Prevention of Lost Updates :- Shared and exclusive locks prevent the issue of lost updates. A
shared lock allows multiple transactions to read a resource concurrently, but only one
transaction with an exclusive lock can modify the resource at a time. This prevents multiple
transactions from overwriting each other's changes.
Deadlock Prevention and Resolution :- Locks help in detecting and resolving deadlocks, which
occur when two or more transactions are waiting for each other to release locks. DBMS systems
often have deadlock detection and resolution mechanisms to ensure that the system can
recover from such situations.
Enforcement of Integrity Constraints :- Locks can be used to enforce integrity constraints by
ensuring that transactions follow predefined rules. For example, an exclusive lock can be
applied to a resource to prevent other transactions from accessing it while a transaction is
modifying the data, ensuring that the data remains consistent.
Control Over Access to Critical Sections :- Locks allow for controlled access to critical sections
of the database. For operations that require exclusive access, such as updates or deletions, an
exclusive lock can be used to ensure that only one transaction at a time can perform these
operations.
Facilitation of Rollback :- In the event of an error or a transaction needing to be rolled back,
locks provide a straightforward way to undo the changes made by the transaction. A rollback
operation typically releases all locks held by the transaction.
Predictable Behavior :- Lock-based concurrency control provides a predictable and well-defined
behavior for transactions. Transactions follow a specific order and acquire/release locks in a
controlled manner, making it easier to reason about the system's behavior.

While locks offer several advantages, it's essential to use them judiciously to avoid potential
downsides such as reduced concurrency and the risk of deadlocks. Database designers need to
strike a balance between ensuring data consistency and allowing for a reasonable level of
concurrent access to optimize system performance.

CONCURRENCY CONTROL
SUB LESSON 16.3

TIMESTAMP METHODS

TIMESTAMP METHODS:-

Concurrency control can be implemented in different ways. One way to implement it is by
using locks; another is by using timestamps.
A timestamp is a unique identifier created by the DBMS to identify a transaction. Timestamps
are usually assigned in the order in which transactions are submitted to the system. We refer to
the timestamp of a transaction T as TS(T).

Timestamp Ordering Protocol

The main idea for this protocol is to order the transactions based on their Timestamps. A
schedule in which the transactions participate is then serializable and the only equivalent serial
schedule permitted has the transactions in the order of their Timestamp Values. Stating simply,
the schedule is equivalent to the particular Serial Order corresponding to the order of the
Transaction timestamps. An algorithm must ensure that, for each item accessed by Conflicting
Operations in the schedule, the order in which the item is accessed does not violate the
ordering. To ensure this, use two Timestamp Values relating to each database item X.

● W_TS(X) is the largest timestamp of any transaction that executed write(X) successfully.
● R_TS(X) is the largest timestamp of any transaction that executed read(X) successfully.

Basic Timestamp Ordering :-

Every transaction is issued a timestamp based on when it enters the system. Suppose if an old
transaction Ti has timestamp TS(Ti), a new transaction Tj is assigned timestamp TS(Tj) such
that TS(Ti) < TS(Tj). The protocol manages concurrent execution such that the timestamps
determine the serializability order. The timestamp ordering protocol ensures that any
conflicting read and write operations are executed in timestamp order. Whenever some
Transaction T tries to issue a R_item(X) or a W_item(X), the Basic TO algorithm compares the
timestamp of T with R_TS(X) & W_TS(X) to ensure that the Timestamp order is not violated.
This describes the Basic TO protocol in the following two cases.
1. Whenever a Transaction T issues a W_item(X) operation, check the following conditions:
● If R_TS(X) > TS (T) or if W_TS(X) > TS (T), then abort and rollback T and reject the operation.
else,
● Execute W_item(X) operation of T and set W_TS(X) to TS (T).
2. Whenever a Transaction T issues a R_item(X) operation, check the following conditions:
● If W_TS(X) > TS(T), then abort and roll back T and reject the operation; else
● If W_TS(X) <= TS(T), then execute the R_item(X) operation of T and set R_TS(X) to the larger of
TS(T) and the current R_TS(X).

Whenever the Basic TO algorithm detects two conflicting operations that occur in an incorrect
order, it rejects the latter of the two operations by aborting the Transaction that issued it.
Schedules produced by Basic TO are guaranteed to be conflict serializable. As already discussed,
using timestamps ensures that the schedule will be deadlock-free.
One drawback of the Basic TO protocol is that Cascading Rollback is still possible. Suppose we
have a Transaction T1 and T2 has used a value written by T1. If T1 is aborted and resubmitted to
the system then, T2 must also be aborted and rolled back. So the problem of Cascading aborts
still prevails.
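The two rule cases above translate almost directly into code. A simplified sketch, assuming one R_TS/W_TS pair per item and modeling an abort as an exception (the function and class names are my own):

```python
# Basic timestamp-ordering checks for write(X) and read(X).
# r_ts / w_ts are per-item dictionaries; ts is the transaction's timestamp.

class Abort(Exception):
    pass

def to_write(ts, item, r_ts, w_ts):
    # Rule 1: reject if a younger transaction already read or wrote the item.
    if r_ts.get(item, 0) > ts or w_ts.get(item, 0) > ts:
        raise Abort(f"write({item}) by TS={ts} rejected")
    w_ts[item] = ts                           # perform the write

def to_read(ts, item, r_ts, w_ts):
    # Rule 2: reject if a younger transaction already wrote the item.
    if w_ts.get(item, 0) > ts:
        raise Abort(f"read({item}) by TS={ts} rejected")
    r_ts[item] = max(r_ts.get(item, 0), ts)   # perform the read

r_ts, w_ts = {}, {}
to_write(3, "X", r_ts, w_ts)       # T1 (TS=3) writes X -> W_TS(X)=3
to_read(4, "X", r_ts, w_ts)        # T2 (TS=4) reads X  -> R_TS(X)=4
try:
    to_write(3, "X", r_ts, w_ts)   # T1 writes X again: R_TS(X)=4 > 3, abort
except Abort as e:
    print(e)
```

The raised exception corresponds to the Basic TO algorithm rejecting the later of two conflicting operations and rolling back the issuing transaction.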

To summarize the advantages and disadvantages of the Basic TO protocol:

● The timestamp ordering protocol ensures serializability, since the precedence graph follows the
order of the transaction timestamps.
● The timestamp protocol ensures freedom from deadlock, as no transaction ever waits.
● But the schedule may not be cascade-free and may not even be recoverable.

Strict Timestamp Ordering –

A variation of Basic TO called Strict TO ensures that the schedules are both strict and conflict
serializable. In this variation, a transaction T that issues a R_item(X) or W_item(X) such that
TS(T) > W_TS(X) has its read or write operation delayed until the transaction T′ that wrote the
value of X has committed or aborted.


KEY TAKEAWAYS :-

In database management systems (DBMS), timestamp methods are used to manage and
control concurrent access to data by multiple transactions. These methods help in maintaining
the consistency and integrity of the database when multiple transactions are executed
simultaneously. Here are some commonly used timestamp methods in DBMS :-

Timestamp Ordering Protocol :- Each transaction is assigned a unique timestamp that reflects
the order of its execution. Transactions are ordered based on their timestamps, and this order
is used to determine the serialization of conflicting transactions.
When a transaction wants to read or write a data item, the timestamp of the transaction
requesting the operation is compared with the timestamp of the last operation on that item.
This comparison determines whether the operation is valid or not.

Thomas's Write Rule :- This rule is a modification of the timestamp ordering protocol that
relaxes the ordering for obsolete writes. If a transaction T tries to write a data item X whose
write timestamp W_TS(X) is already greater than TS(T), meaning a younger transaction has
already written X, then T's outdated write is simply ignored rather than causing a rollback.

Multi version Timestamp Ordering :- In this method, multiple versions of a data item are
maintained to allow for concurrent read and write operations.
Each version is associated with a timestamp, and transactions are allowed to read the version
of the data item that has a timestamp less than or equal to the transaction's timestamp.
This method helps in reducing conflicts between transactions and allows for a higher degree of
concurrency.
Concurrency Control Using Interval Timestamps :- Transactions are assigned intervals or ranges
of timestamps rather than a single timestamp.
Each data item is associated with an interval indicating the times during which it can be read or
written.

Transactions can only read or write data items if their timestamp falls within the allowed
interval for that item.
Validation-based Timestamp Ordering:
In this method, transactions are assigned timestamps as usual, but the actual execution of a
transaction is delayed until it is validated.
Validation involves checking if the transactions that have committed before the current
transaction's timestamp have caused any conflicts.
If conflicts are detected, the transaction is rolled back; otherwise, it is allowed to execute.
These timestamp methods help in ensuring the serializability of transactions and preventing
conflicts in a multi-user database environment. The choice of a specific method depends on
factors such as the system requirements, workload characteristics, and desired level of
concurrency.

CONCURRENCY CONTROL
SUB LESSON 16.4

OPTIMISTIC METHOD

WHAT IS OPTIMISTIC METHOD ?

➢ Most current approaches to concurrency control in database systems rely on locking of data
objects as a control mechanism. There are, however, families of nonlocking concurrency
controls. These methods are “optimistic” in the sense that they rely mainly on transaction
backup as a control mechanism, “hoping” that conflicts between transactions will not occur.
They tend to be more efficient than locking for applications in which conflicts are rare.

➢ All data items are updated at the end of the transaction; if, at that point, any data item is found
inconsistent with respect to its value in the database, the transaction is rolled back.
➢ Check for conflicts at the end of the transaction. No checking while the transaction is executing.
Checks are all made at once, so low transaction execution overhead. Updates are not applied
until end-transaction. They are applied to local copies in a transaction space.

READ PHASE :-

Various data items are read and stored in temporary variables (local copies). All operations are
performed in these variables without updating the database.

VALIDATION PHASE :-

All concurrently accessed data items are checked to ensure that serializability will not be
violated if the transaction's updates are actually applied to the database. Any conflicting change
in a value causes the transaction to roll back. Transaction timestamps are used, and the
write-sets and read-sets are maintained.

To check that transaction A does not interfere with transaction B, one of the following must hold:
● TransB completes its write phase before TransA starts its read phase.
● TransA starts its write phase after TransB completes its write phase, and the read set of TransA
has no items in common with the write set of TransB.
● Both the read set and write set of TransA have no items in common with the write set of TransB,
and TransB completes its read phase before TransA completes its read phase.

WRITE PHASE:-

➢ The transaction's updates are applied to the database if the validation is successful. Otherwise,
the updates are discarded and the transaction is aborted and restarted. This method does not
use any locks and hence is deadlock-free; however, starvation of transactions may occur.

WEB USAGE :-

➢ The stateless nature of HTTP makes locking infeasible for web user interfaces. It is common for
a user to start editing a record, then leave without following a "cancel" or "logout" link. If
locking is used, other users who attempt to edit the same record must wait until the first user's
lock times out.

➢ HTTP does provide a form of built-in OCC. The response to an initial GET request can include an
ETag for subsequent PUT requests to use in the If-Match header. Any PUT request with an out-
of-date ETag in the If-Match header can then be rejected.

➢ Some database management systems offer OCC natively, without requiring special application
code. For others, the application can implement an OCC layer outside of the database, and
avoid waiting or silently overwriting records. In such cases, the form may include a hidden field
with the record's original content, a timestamp, a sequence number, or an opaque token. On
submit, this is compared against the database. If it differs, the conflict resolution algorithm is
invoked.

Example:
S: W1(X), r2(Y), r1(Y), r2(X), with TS(T1) = 3 and TS(T2) = 4.
Check whether the timestamp ordering protocol allows schedule S.
Solution
Initially, for data item X: RTS(X) = 0, WTS(X) = 0.
Initially, for data item Y: RTS(Y) = 0, WTS(Y) = 0.

For W1(X): is TS(T1) < RTS(X) or TS(T1) < WTS(X)? 3 < 0 is FALSE,
=> so perform the write operation W1(X) and set WTS(X) = 3.
For r2(Y): is TS(T2) < WTS(Y)? 4 < 0 is FALSE,
=> so perform the read operation r2(Y) and set RTS(Y) = 4.
For r1(Y): is TS(T1) < WTS(Y)? 3 < 0 is FALSE,
=> so perform the read operation r1(Y).
For r2(X): is TS(T2) < WTS(X)? 4 < 3 is FALSE,
=> so perform the read operation r2(X) and set RTS(X) = 4.

Every operation is permitted, so the timestamp ordering protocol allows schedule S.
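The same schedule can be replayed mechanically. A small self-contained sketch that applies the basic timestamp-ordering checks step by step (variable and function names are illustrative):

```python
# Replaying schedule S: W1(X), r2(Y), r1(Y), r2(X) with TS(T1)=3, TS(T2)=4
# under basic timestamp ordering. All R_TS/W_TS start at 0.

r_ts = {"X": 0, "Y": 0}
w_ts = {"X": 0, "Y": 0}

def write(ts, item):
    if r_ts[item] > ts or w_ts[item] > ts:
        return f"W(TS={ts}, {item}) rejected"
    w_ts[item] = ts
    return f"W({item}) ok, W_TS({item})={ts}"

def read(ts, item):
    if w_ts[item] > ts:
        return f"r(TS={ts}, {item}) rejected"
    r_ts[item] = max(r_ts[item], ts)
    return f"r({item}) ok, R_TS({item})={r_ts[item]}"

for step in (write(3, "X"), read(4, "Y"), read(3, "Y"), read(4, "X")):
    print(step)
# Final state: W_TS(X)=3, R_TS(X)=4, R_TS(Y)=4 -> no operation is rejected,
# so schedule S is allowed, matching the hand calculation.
```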

ADVANTAGES OF THE OPTIMISTIC METHOD:-

➢ Increased Concurrency: Optimistic concurrency control allows for a higher degree of
concurrency, since transactions do not acquire locks during the read phase.
➢ Reduced Locking Overhead: There is no need for locking during the execution of transactions,
which can reduce the overhead associated with lock management.

DISADVANTAGES AND CONSIDERATIONS :-

➢ Potential for Rollbacks: Since conflicts are detected only at the validation phase, there is a
possibility that a transaction may need to be rolled back after it has performed a significant
amount of work.
➢ Increased Validation Overhead: The validation phase introduces additional overhead to check
for conflicts, especially in environments with high contention for resources.
➢ The choice between optimistic and pessimistic concurrency control methods depends on
factors such as the expected workload, system requirements, and the nature of transactions in
a given application.

KEY TAKEAWAYS :-

The optimistic method, also known as optimistic concurrency control, is an approach used in
database management systems (DBMS) to handle concurrent access to data by multiple
transactions. The optimistic method assumes that conflicts between transactions are rare, and
it allows transactions to proceed without acquiring locks on data items. Conflicts are detected
and resolved only at the time of transaction commitment. This method is based on the belief
that most transactions can be executed without interference, and conflicts are the exception
rather than the rule.

DATABASE SECURITY

SUB LESSON 17.1

DATABASE SECURITY AND ITS ISSUES

WHAT IS DATABASE SECURITY ?

➢ Database security includes a variety of measures used to secure database management systems
from malicious cyber-attacks and illegitimate use. Database security programs are designed to
protect not only the data within the database, but also the data management system itself, and
every application that accesses it, from misuse, damage, and intrusion.
➢ Database security encompasses tools, processes, and methodologies which establish security
inside a database environment.

DATABASE SECURITY THREATS :-

Many software vulnerabilities, misconfigurations, or patterns of misuse or carelessness could
result in breaches. Here are a number of the most common causes and types of database
security cyber threats.

INSIDER THREATS :-

An insider threat is a security risk from one of the following three sources, each of which has
privileged means of entry to the database:
• A malicious insider with ill-intent
• A negligent person within the organization who exposes the database to attack through
careless actions
• An outsider who obtains credentials through social engineering or other methods, or
gains access to the database’s credentials
An insider threat is one of the most typical causes of database security breaches and it often
occurs because a lot of employees have been granted privileged user access.

HUMAN ERROR :-

Weak passwords, password sharing, accidental erasure or corruption of data, and other
undesirable user behaviours are still the cause of almost half of data breaches reported.

EXPLOITATION OF DATABASE SOFTWARE VULNERABILITIES :-

Attackers constantly attempt to isolate and target vulnerabilities in software, and database
management software is a highly valuable target. New vulnerabilities are discovered daily, and
all open source database management platforms and commercial database software vendors
issue security patches regularly. However, if you don’t use these patches quickly, your database
might be exposed to attack.

Even if you do apply patches on time, there is always the risk of zero-day attacks, when
attackers discover a vulnerability, but it has not yet been discovered and patched by the
database vendor.

SQL/NOSQL INJECTION ATTACKS :-

A database-specific threat involves the insertion of arbitrary SQL (or, in non-relational systems,
NoSQL) attack strings into database queries. Typically, these strings are injected through web
application forms or arrive via HTTP requests. Any database system is vulnerable to these
attacks if developers do not adhere to secure coding practices and the organization does not
carry out regular vulnerability testing.
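As a defensive illustration, parameterized queries keep attacker-supplied strings out of the SQL text. A sketch using Python's built-in sqlite3 module and an in-memory database (table and payload are made up for the example):

```python
import sqlite3

# Parameterized queries pass user input as data, never as SQL text,
# which is the standard defense against SQL injection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

malicious = "alice' OR '1'='1"   # classic injection payload

# Safe: the ? placeholder treats the payload as a literal string value.
rows = conn.execute("SELECT role FROM users WHERE name = ?",
                    (malicious,)).fetchall()
print(rows)   # [] -> no user is literally named "alice' OR '1'='1"

rows = conn.execute("SELECT role FROM users WHERE name = ?",
                    ("alice",)).fetchall()
print(rows)   # [('admin',)]
```

Building the same query by string concatenation would instead splice the payload into the SQL text, turning the WHERE clause into a condition that matches every row.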

BUFFER OVERFLOW ATTACKS :-

Buffer overflow takes place when a process tries to write a large amount of data to a fixed-
length block of memory, more than it is permitted to hold. Attackers might use the excess data,
kept in adjacent memory addresses, as the starting point from which to launch attacks.

DENIAL OF SERVICE (DOS/DDOS) ATTACKS :-

In a denial of service (DoS) attack, the cybercriminal overwhelms the target service—in this
instance the database server—using a large amount of fake requests. The result is that the
server cannot carry out genuine requests from actual users, and often crashes or becomes
unstable.

In a distributed denial of service attack (DDoS), fake traffic is generated by a large number of
computers, participating in a botnet controlled by the attacker. This generates very large traffic
volumes, which are difficult to stop without a highly scalable defensive architecture. Cloud-
based DDoS protection services can scale up dynamically to address very large DDoS attacks.

MALWARE :-

Malware is software written to take advantage of vulnerabilities or to cause harm to a
database. Malware could arrive through any endpoint device connected to the database's
network. Malware protection is important on any endpoint, but especially so on database
servers, because of their high value and sensitivity.

AN EVOLVING IT ENVIRONMENT :-

The evolving IT environment is making databases more susceptible to threats. Here are trends
that can lead to new types of attacks on databases, or may require new defensive measures:

GROWING DATA VOLUME :-

Storage, data capture, and processing are growing exponentially across almost all organizations.
Any data security practices or tools must be highly scalable to address near- and distant-future
requirements.

DISTRIBUTED INFRASTRUCTURE :-

Network environments are increasing in complexity, especially as businesses transfer workloads
to hybrid cloud or multi-cloud architectures, making the deployment, management, and choice
of security solutions more difficult.

INCREASINGLY TIGHT REGULATORY REQUIREMENTS :-

The worldwide regulatory compliance landscape is growing in complexity, so following all
mandates is becoming more challenging.

CYBER SECURITY SKILLS SHORTAGE :-

There is a global shortage of skilled cybersecurity professionals, and organizations are finding it
difficult to fill security roles. This can make it more difficult to defend critical infrastructure,
including databases.

KEY TAKEAWAYS :-

Database security is a critical aspect of database management systems (DBMS) to ensure the
confidentiality, integrity, and availability of data. Protecting sensitive information from
unauthorized access, maintaining data accuracy, and preventing data loss are key objectives of
database security.

DATABASE SECURITY


SUB LESSON 17.2

GRANTING AND REVOKING PRIVILEGES

WHAT IS GRANTING AND REVOKING PRIVILEGES :-

➢ Granting and revoking privileges on modules is a task that you would perform when you want
to allow or disallow users of the database to be able to reference objects defined within the
module as part of a security practice.
➢ Review the GRANT (Module privileges) statement syntax and the Revoke (Module
privileges) statement syntax.
➢ The module must exist.
➢ Ensure that you have the authority to execute either the GRANT or REVOKE statement.
➢ Identify the authorization ID of the user, group, or role to be granted or revoked
privileges.

ABOUT :-

The EXECUTE privilege on a module can be granted or revoked. The EXECUTE privilege enables
users to perform many tasks including: execute published routines defined in the module, read
and write to published global variables, reference any published user-defined types.

PROCEDURE:-

➢ Formulate a GRANT or REVOKE statement:
➢ Specify the GRANT clause to grant privileges or the REVOKE clause to revoke privileges.
➢ Specify the EXECUTE ON MODULE clause to specify the privilege.
➢ Specify the name of the module.
➢ Specify the TO clause if granting the privilege or specify the FROM clause if revoking the
privilege.
➢ Specify the authorization ID of the user for whom the privileges are being changed.
➢ Execute the statement.

RESULTS :-

If the statement executes successfully, the privileges on the module will be updated.

Example
The following is an example of an SQL statement that grants the EXECUTE privilege on
module inventory to the authorization ID jones:

GRANT EXECUTE ON MODULE inventory TO jones@

The following is an example of an SQL statement that revokes the EXECUTE privilege on
module inventory from the authorization ID macdonald:

REVOKE EXECUTE ON MODULE inventory FROM macdonald@

(In these examples, the trailing @ is used as a non-default statement terminator.)

2. Revoke:
The REVOKE command withdraws privileges on database objects from users, if any were
granted. It performs the opposite operation to the GRANT command. When a privilege is
revoked from a particular user U, the privileges granted by user U to all other users are also
revoked.
Syntax:

REVOKE privilege_name
ON object_name
FROM {user_name | PUBLIC | role_name};

Example:

GRANT INSERT, SELECT ON accounts TO Ram;

With the above command, user Ram is granted permissions on the accounts database object:
he can query or insert into accounts.

REVOKE INSERT, SELECT ON accounts FROM Ram;

With the above command, user Ram's permissions to query or insert on the accounts database
object are removed.
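The cascading behaviour of REVOKE described above can be sketched with a small in-memory model. This is an illustrative sketch only, not a real DBMS implementation; the privilege table, user names, and function names here are all hypothetical.

```python
# Minimal in-memory model of GRANT/REVOKE with cascading revoke.
# Each grant records who granted which privilege on which object to whom.

grants = []  # list of (grantor, grantee, privilege, object)

def grant(grantor, grantee, privilege, obj):
    grants.append((grantor, grantee, privilege, obj))

def revoke(grantee, privilege, obj):
    """Revoke a privilege from grantee and cascade to grants they made."""
    global grants
    # Remove the direct grant(s) to this grantee.
    grants = [g for g in grants
              if not (g[1] == grantee and g[2] == privilege and g[3] == obj)]
    # Cascade: revoke the same privilege from anyone this grantee granted it to.
    dependents = [g[1] for g in grants
                  if g[0] == grantee and g[2] == privilege and g[3] == obj]
    for d in dependents:
        revoke(d, privilege, obj)

def has_privilege(user, privilege, obj):
    return any(g[1] == user and g[2] == privilege and g[3] == obj for g in grants)

grant("dba", "Ram", "SELECT", "accounts")
grant("Ram", "Shyam", "SELECT", "accounts")   # Ram passes the privilege on
revoke("Ram", "SELECT", "accounts")           # cascades to Shyam as well
print(has_privilege("Ram", "SELECT", "accounts"))    # False
print(has_privilege("Shyam", "SELECT", "accounts"))  # False
```

Note that revoking Ram's privilege also removes Shyam's, because Shyam obtained it from Ram; this is the cascade the text describes.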

DIFFERENCES BETWEEN GRANT AND REVOKE COMMANDS:

S.NO  Grant                                    Revoke

1     This DCL command grants permissions     This DCL command removes permissions, if any
      to the user on database objects.        were granted, to users on database objects.

2     It assigns access rights to users.      It withdraws the access rights of users.

3     For each user, you need to specify      If access for one user is revoked, all the
      the permissions individually.           permissions granted by that user to others
                                              are also removed.

4     When access is decentralized,           With decentralized access, removing the
      granting permissions is easy.           granted permissions is difficult.

KEY TAKEAWAYS
In database management systems (DBMS), the GRANT and REVOKE statements are used to
manage user privileges and permissions. These statements are part of the Data Control
Language (DCL) in SQL, which deals with the permissions and access control within a database.

Grant Syntax:
GRANT privilege_name
ON object_name
TO {user_name | PUBLIC | role_name}
[WITH GRANT OPTION];

Revoke Syntax:
REVOKE privilege_name
ON object_name
FROM {user_name | PUBLIC | role_name};



SUB LESSON 17.3

ROLE-BASED ACCESS CONTROL

DEFINITION OF ROLE-BASED ACCESS CONTROL (RBAC) :-

➢ Role-based access control (RBAC) restricts network access based on a person's role within an
organization and has become one of the main methods for advanced access control. The roles
in RBAC refer to the levels of access that employees have to the network.
➢ The concept of role-based access control is to create a set of permissions and assign these
permissions to a user or group. With these permissions, users can be given only the limited
access they need, so the level of security is increased.
➢ Employees are only allowed to access the information necessary to effectively perform their job
duties. Access can be based on several factors, such as authority, responsibility, and job
competency. In addition, access to computer resources can be limited to specific tasks such as
the ability to view, create, or modify a file.
➢ As a result, lower-level employees usually do not have access to sensitive data if they do not
need it to fulfil their responsibilities. This is especially helpful if you have many employees and
use third parties and contractors that make it difficult to closely monitor network access. Using
RBAC will help in securing your company’s sensitive data and important applications.

EXAMPLES OF ROLE-BASED ACCESS CONTROL :-

Through RBAC, you can control what end-users can do at both broad and granular levels. You
can designate whether the user is an administrator, a specialist user, or an end-user, and align
roles and access permissions with your employees’ positions in the organization. Permissions
are allocated only with enough access as needed for employees to do their jobs.

What if an end-user's job changes? You may need to manually reassign their role, or you can
assign roles to a role group or use a role assignment policy to add or remove members of a
role group.
Some of the designations in an RBAC tool can include:
➢ Management role scope – it limits what objects the role group is allowed to manage.
➢ Management role group – you can add and remove members.
➢ Management role – these are the types of tasks that can be performed by a specific role
group.
➢ Management role assignment – this links a role to a role group.
By adding a user to a role group, the user has access to all the roles in that group. If they are
removed, access becomes restricted. Users may also be assigned to multiple groups in the
event they need temporary access to certain data or programs and then removed once the
project is complete.
Other options for user access may include:
➢ Primary – the primary contact for a specific account or role.
➢ Billing – access for one end-user to the billing account.
➢ Technical – assigned to users that perform technical tasks.
➢ Administrative – access for users that perform administrative tasks.
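The role groups and designations above can be sketched as a small permission check. This is a minimal illustration under stated assumptions: the role names, permission names, and helper functions are hypothetical, not part of any particular RBAC product.

```python
# Illustrative RBAC sketch: roles map to permission sets, and users may be
# assigned to several role groups at once (e.g. for temporary project access).

ROLE_PERMISSIONS = {
    "billing":        {"view_invoice", "pay_invoice"},
    "technical":      {"view_logs", "restart_service"},
    "administrative": {"create_user", "delete_user"},
}

user_roles = {}  # user -> set of assigned roles

def assign_role(user, role):
    user_roles.setdefault(user, set()).add(role)

def remove_role(user, role):
    user_roles.get(user, set()).discard(role)

def can(user, permission):
    """A user holds a permission if any of their roles grants it."""
    return any(permission in ROLE_PERMISSIONS.get(r, set())
               for r in user_roles.get(user, set()))

assign_role("alice", "billing")
assign_role("alice", "technical")        # member of multiple role groups
print(can("alice", "view_invoice"))      # True
remove_role("alice", "billing")          # removal restricts access again
print(can("alice", "view_invoice"))      # False
print(can("alice", "restart_service"))   # True
```

Adding or removing a user from a role group immediately changes what they can do, which is the administrative convenience the section describes.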

BENEFITS OF RBAC :-

Managing and auditing network access is essential to information security. Access can and
should be granted on a need-to-know basis. With hundreds or thousands of employees,
security is more easily maintained by limiting unnecessary access to sensitive information based
on each user’s established role within the organization. Other advantages include:
1. Reducing administrative work and IT support. With RBAC, you can reduce the need for
paperwork and password changes when an employee is hired or changes their role. Instead,
you can use RBAC to add and switch roles quickly and implement them globally across
operating systems, platforms and applications. It also reduces the potential for error when
assigning user permissions. This reduction in time spent on administrative tasks is just one of
several economic benefits of RBAC. RBAC also helps to more easily integrate third-party users
into your network by giving them pre-defined roles.
2. Maximizing operational efficiency. RBAC offers a streamlined approach that is logical in
definition. Instead of trying to administer lower-level access control, all the roles can be aligned
with the organizational structure of the business and users can do their jobs more efficiently
and autonomously.
3. Improving compliance. All organizations are subject to federal, state and local
regulations. With an RBAC system in place, companies can more easily meet statutory and
regulatory requirements for privacy and confidentiality as IT departments and executives have
the ability to manage how data is being accessed and used. This is especially significant for
health care and financial institutions, which manage lots of sensitive data such as PHI and PCI
data.

KEY TAKEAWAYS:-

➢ Role-Based Access Control (RBAC) is a security model that is commonly used in database
management systems (DBMS) to manage and control access to data. In the context of
multilevel security, RBAC can be extended to ensure that users have access to information at
their security clearance level and prevent unauthorized access to sensitive data.
➢ It's important to note that the specific implementation of RBAC for multilevel security may vary
depending on the requirements of the organization, the DBMS in use, and any relevant
regulatory or compliance considerations. Regular assessments and updates to security policies
are crucial to adapting to changes in the organization's structure and security landscape.

BACKUP AND RECOVERY IN DATABASE



SUB LESSON 18.1

DATABASE BACKUP AND RECOVERY CONCEPTS

DATABASE BACKUP AND RECOVERY:-

➢ It is imperative to have a backup of the database in case the original is corrupted or lost for any
reason. Using this backup, the database can be recovered as it was before the failure.
➢ Database backup basically means that a duplicate of the database information and data is
created and stored in the backup server just to be on the safe side. Transaction logs are also
stored in the backup along with the database data because, without them, the data would be
useless.

REASONS FOR FAILURE IN A DATABASE :-

There can be multiple reasons for failure in a database because of which a database backup and
recovery plan is required. Some of these reasons are:
User Error - Normally, user error is the biggest reason for data destruction or corruption in a
database. To rectify the error, the database needs to be restored to the point in time before the
error occurred.
Hardware Failure - This can also lead to loss of data in a database. The database is stored on
multiple hard drives across various locations. These hard drives may sometimes malfunction,
leading to database corruption, so it is important to replace them periodically.
Catastrophic Event - A catastrophic event can be a natural calamity like a flood or earthquake
or deliberate sabotage such as hacking of the database. Either way, the database data may be
corrupted, and backup may be required.



BACKUP :-

Backup refers to storing a copy of original data which can be used in case of data loss. Backup is
considered one of the approaches to data protection. Important data of the organization needs
to be kept in backup efficiently for protecting valuable data. Backup can be achieved by storing
a copy of the original data separately or in a database on storage devices. There are various
types of backups available like full backup, incremental backup, Local backup, mirror backup,
etc. An example of a Backup can be SnapManager which makes a backup of everything in the
database.
Data backup is the practice of copying data from a primary to a secondary location, to protect it
in case of a disaster, accident or malicious action. Data is the lifeblood of modern organizations,
and losing data can cause massive damage and disrupt business operations. This is why backing
up your data is critical for all businesses, large and small.

METHODS OF BACKUP :-

The different methods of backup in a database are:


Full Backup - This method takes a lot of time as the full copy of the database is made including
the data and the transaction records.
Transaction Log - Only the transaction logs are saved as the backup in this method. To keep the
backup file as small as possible, the previous transaction log details are deleted once a new
backup record is made.
Differential Backup - This is similar to full backup in that it stores both the data and the
transaction records. However, only that information is saved in the backup that has changed
since the last full backup. Because of this, differential backup leads to smaller files.
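The difference between a full backup and a differential backup can be sketched over a toy "database" of pages. This is a conceptual sketch under stated assumptions: the page store, change tracking, and names here are hypothetical, not how any particular DBMS implements backups.

```python
# Sketch of full vs. differential backup over a toy "database" of pages.
# A differential backup keeps only pages changed since the last FULL backup,
# so a restore needs just the full backup plus the latest differential.

db = {"p1": "a", "p2": "b", "p3": "c"}

full_backup = dict(db)         # full copy taken at backup time
changed_since_full = set()     # pages modified since the full backup

def write(page, value):
    db[page] = value
    changed_since_full.add(page)

write("p2", "B")
write("p2", "B2")              # changed many times: only the latest version kept
differential = {p: db[p] for p in changed_since_full}

# Restore: start from the full backup, then apply the differential.
restored = dict(full_backup)
restored.update(differential)
print(restored == db)  # True
print(differential)    # {'p2': 'B2'}
```

Because the differential holds only the most recent version of each changed page, it stays small relative to a second full backup, which is the point made above.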



RECOVERY :-

Recovery refers to restoring lost data by following some process. Even data that was backed up
can be lost, and it can then be recovered by applying recovery techniques. When a database
fails for any reason, there is a chance of data loss; in that case the recovery process helps
improve the reliability of the database. An example of a recovery tool is SnapManager, which
recovers the data up to the last transaction.

DATABASE RECOVERY :-

There are two methods that are primarily used for database recovery. These are:
Log-based recovery - In log-based recovery, logs of all database transactions are stored in a
secure area so that in case of a system failure, the database can recover the data. All log
information, such as the time of the transaction; its data etc. should be stored before the
transaction is executed.
Shadow paging - In shadow paging, after the transaction is completed, its data is automatically
stored for safekeeping. So, if the system crashes in the middle of a transaction, changes made
by it will not be reflected in the database.
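The log-based recovery described above can be sketched as follows. This is a simplified illustration, assuming a single-threaded system and an in-memory log; the log format and function names are hypothetical.

```python
# Sketch of log-based (undo) recovery: before each update is applied, the old
# value is written to the log. After a crash, uncommitted transactions are
# rolled back by restoring old values from the log in reverse order.

db = {"x": 10, "y": 20}
log = []  # entries: ("write", T, item, old_value, new_value) or ("commit", T)

def write_item(t, item, new_value):
    log.append(("write", t, item, db[item], new_value))  # log BEFORE applying
    db[item] = new_value

def commit(t):
    log.append(("commit", t))

write_item("T1", "x", 11)
commit("T1")
write_item("T2", "y", 99)   # T2 never commits before the "crash"

def recover():
    committed = {e[1] for e in log if e[0] == "commit"}
    for entry in reversed(log):
        if entry[0] == "write" and entry[1] not in committed:
            _, t, item, old, new = entry
            db[item] = old          # undo: restore the logged old value

recover()
print(db)  # {'x': 11, 'y': 20} -- T1's update survives, T2's is undone
```

Writing the log entry before applying the change is what makes the undo possible: the old value is always available to the recovery manager.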

KEY TAKEAWAYS

BACKUP STRATEGIES:

Full Backups:
➢ A full backup involves copying the entire database at a specific point in time. It provides a
complete snapshot of the database, making it suitable for a comprehensive recovery.
➢ Full backups are typically scheduled periodically, such as daily or weekly, depending on the data
volatility and recovery point objectives.
Incremental Backups:



➢ Incremental backups capture only the changes made to the database since the last backup,
whether it was a full backup or a previous incremental backup.
➢ Incremental backups are useful for reducing backup times and storage requirements, but the
recovery process may involve restoring multiple backup sets.
Differential Backups:
➢ Differential backups capture changes made to the database since the last full backup. Unlike
incremental backups, differential backups accumulate changes relative to the last full backup,
not the last backup of any type.
➢ Restoring from a differential backup may be faster than restoring from incremental backups.
Snapshot Backups:
➢ Some modern database systems support snapshot backups, which create a point-in-time
snapshot of the entire database. This can be used for creating consistent backups without
locking the database.
Online and Offline Backups:
➢ Online backups can be performed while the database is operational. This minimizes downtime
but may require additional precautions to ensure data consistency.
➢ Offline backups involve taking the database offline during the backup process, ensuring a
consistent snapshot of the entire database.
Recovery Strategies:
Point-in-Time Recovery:
➢ Point-in-time recovery allows the database to be restored to a specific point in time, typically
just before a critical event or data corruption occurred.
➢ This strategy is useful for recovering from human errors or data corruption while minimizing
data loss.
Rollback and Roll Forward Recovery:
➢ Rollback recovery involves undoing transactions to restore the database to a previous state.
This is typically used for reverting changes made by erroneous transactions.
➢ Rollforward recovery involves applying transactions from a log to bring the database to a
desired state. This is useful for recovering from a backup to the current point in time.
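Point-in-time recovery via rollforward can be sketched as restoring a full backup and then replaying logged transactions up to a chosen moment. This is a conceptual sketch; the log format and function name are hypothetical.

```python
# Point-in-time recovery sketch: restore a full backup, then roll forward by
# replaying logged changes up to (and including) the chosen point in time.

backup = {"balance": 100}      # full backup taken at time 0
txn_log = [
    (1, "balance", 150),       # (timestamp, item, new_value)
    (2, "balance", 120),
    (3, "balance", 0),         # erroneous update we want to stop before
]

def point_in_time_restore(backup_copy, log, target_time):
    db = dict(backup_copy)
    for ts, item, value in log:
        if ts <= target_time:  # roll forward only up to the target time
            db[item] = value
    return db

print(point_in_time_restore(backup, txn_log, 2))  # {'balance': 120}
```

Choosing `target_time = 2` recovers the state just before the erroneous update at time 3, which is exactly the "before a critical event" scenario described above.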



Recovery from Backup Copies:
➢ In the event of data loss or corruption, recovery involves restoring the database from a backup.
Depending on the backup strategy, this may involve restoring a full backup, followed by
incremental or differential backups to bring the database up to the desired point in time.
Transaction Log Management:
➢ Transaction logs are crucial for recovery. Regularly back up and archive transaction logs to
ensure a complete and consistent record of changes to the database.
➢ Some recovery scenarios may involve replaying transaction logs to restore the database to a
specific point in time.
Testing and Validation:
➢ Regularly test the backup and recovery process to ensure its reliability. This includes simulating
various failure scenarios to validate the effectiveness of the recovery procedures.
Offsite and Cloud Backups:
➢ Store backup copies offsite or in the cloud to protect against physical disasters, such as fire or
flood, that could impact the primary data center.
➢ A well-designed backup and recovery strategy is a fundamental aspect of database
administration, providing the means to recover from various types of failures or disasters and
ensuring data availability and integrity. The specific approach may vary based on the
characteristics of the database, the recovery objectives, and the available infrastructure.
Regular monitoring, testing, and updating of backup procedures are essential to maintaining a
robust recovery capability.



SUB LESSON 18.2

DATABASE BACKUP AND RECOVERY TECHNIQUES

DATABASE BACKUP AND RECOVERY TECHNIQUES :-

➢ Database systems, like any other computer system, are subject to failures but the data stored
in them must be available as and when required. When a database fails it must possess the
facilities for fast recovery. It must also have atomicity i.e., either transactions are completed
successfully and committed (the effect is recorded permanently in the database) or the
transaction should have no effect on the database. There are both automatic and non-
automatic ways for both, backing up data and recovery from any failure situations. The
techniques used to recover lost data due to system crashes, transaction errors, viruses,
catastrophic failure, incorrect command execution, etc. are database recovery techniques. So,
to prevent data loss recovery techniques based on deferred updates and immediate updates or
backing up data can be used. Recovery techniques are heavily dependent upon the existence of
a special file known as a system log. It contains information about the start and end of each
transaction and any updates which occur during the transaction. The log keeps track of all
transaction operations that affect the values of database items. This information is needed to
recover from transaction failure.

➢ Checkpoint :- Checkpoint is a mechanism where all the previous logs are removed from the
system and stored permanently in a storage disk. Checkpoint declares a point before which the
DBMS was in a consistent state, and all the transactions were committed.
➢ Undoing – If a transaction crashes, then the recovery manager may undo transactions i.e.
reverse the operations of a transaction. This involves examining a transaction for the log entry
write_item(T, x, old_value, new_value) and setting the value of item x in the database to
old_value. There are two major techniques for recovery from non-catastrophic transaction failures:
deferred updates and immediate updates.

BACKUP AND RECOVER IN DATABASE 28


➢ Immediate update – In the immediate update, the database may be updated by some
operations of a transaction before the transaction reaches its commit point. However, these
operations are recorded in a log on disk before they are applied to the database, making
recovery still possible. If a transaction fails to reach its commit point, the effect of its operation
must be undone i.e., the transaction must be rolled back hence we require both undo and redo.
This technique is known as the undo/redo algorithm.
➢ Caching/Buffering – In these one or more disk pages that include data items to be updated are
cached into main memory buffers and then updated in memory before being written back to
disk. A collection of in-memory buffers called the DBMS cache is kept under the control of
DBMS for holding these buffers. A directory is used to keep track of which database items are in
the buffer. A dirty bit is associated with each buffer, which is 0 if the buffer is not modified else
1 if modified.
➢ Shadow paging – It provides atomicity and durability. A directory with n entries is constructed,
where the ith entry points to the ith database page on the link. When a transaction begins
executing the current directory is copied into a shadow directory. When a page is to be
modified, a shadow page is allocated in which the changes are made, and when it is ready to
become durable, all pages that referred to the original are updated to refer to the new
replacement page.
➢ Backward Recovery – The terms "rollback" and "UNDO" can also refer to backward recovery.
When a backup of the data is not available and previous modifications need to be undone, this
technique can be helpful. With the backward recovery method, unwanted modifications are
removed and the database is returned to its prior state. All changes made during the
offending transaction are reversed; in other words, the erroneous database updates are undone
and the valid transactions are reprocessed.
➢ Forward Recovery – "Roll forward" and "REDO" refer to forward recovery. When a database
needs to be brought up to date with all verified changes, this forward recovery technique is
helpful. The changes of committed transactions are reapplied to the database to roll those
modifications forward. In other words, the database is restored from preserved data, and the
valid transactions recorded since the last save are reapplied.
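The shadow-paging technique from the list above can be sketched as follows. This is a simplified, single-transaction illustration under stated assumptions: the page store, directory layout, and function names are hypothetical.

```python
# Shadow-paging sketch: when a transaction begins, the current page directory
# is copied to a shadow directory. Updates allocate NEW physical pages instead
# of modifying pages in place, so recovery after a crash simply discards the
# new directory and keeps the shadow -- the original pages were never touched.

pages = {0: "old-A", 1: "old-B"}   # physical page store
current_dir = {0: 0, 1: 1}         # logical page -> physical page
next_free = 2

def tx_write(directory, logical, value):
    """Write by allocating a fresh physical page (never update in place)."""
    global next_free
    pages[next_free] = value
    directory[logical] = next_free
    next_free += 1

shadow_dir = dict(current_dir)     # taken when the transaction begins
tx_write(current_dir, 0, "new-A")

# Case 1: crash before commit -> reinstate the shadow directory.
recovered = {lp: pages[pp] for lp, pp in shadow_dir.items()}
print(recovered[0])                # old-A: atomicity is preserved

# Case 2: commit -> the updated directory becomes the durable state.
committed = {lp: pages[pp] for lp, pp in current_dir.items()}
print(committed[0])                # new-A
```

Because the original physical page is never overwritten, both the "crash" and "commit" outcomes remain available until the directory switch, which is what gives shadow paging its atomicity and durability.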



Some of the backup techniques are as follows:

➢ Full database backup – In this method, the full database, including the data and the
metadata needed to restore the whole database (including full-text catalogues), is backed up
on a predefined schedule.
➢ Differential backup – It stores only the data changes that have occurred since the last full
database backup. When some data has changed many times since the last full database backup,
a differential backup stores the most recent version of the changed data. For this first, we need
to restore a full database backup.
➢ Transaction log backup – In this method, all events that have occurred in the database, such
as a record of every single statement executed, are backed up. It is a backup of the transaction
log entries and contains all transactions that have happened to the database. Through this, the
database can be recovered to a specific point in time. It is even possible to take a backup from
the transaction log if the data files are destroyed, without losing a single committed
transaction.

KEY TAKEAWAYS:-

Backup and recovery are critical aspects of database management to ensure data integrity and
availability. Here are common techniques used in database management systems (DBMS) for
backup and recovery:
Backup Techniques :-

Full Backups:
Description: Entire database is copied.
Advantages: Simple and straightforward.
Disadvantages: Consumes more storage and time.



Incremental Backups:
Description: Only data that has changed since the last backup is copied.
Advantages: Requires less storage space and time compared to full backups.
Disadvantages: Longer recovery time, as multiple backup sets may be needed for restoration.
Differential Backups:
Description: Copies data that has changed since the last full backup.
Advantages: Faster recovery than incremental backups.
Disadvantages: Requires more storage space compared to incremental backups.
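The different restore paths for incremental and differential backups can be sketched side by side. This is a conceptual sketch with hypothetical page names; real backup tooling tracks changes at the block or extent level.

```python
# Restore-path sketch: an incremental chain needs every backup set taken since
# the last full backup, while a differential needs only the latest one.

full = {"p1": 1, "p2": 2}
inc1 = {"p2": 20}            # changes since the full backup
inc2 = {"p1": 10}            # changes since inc1
diff = {"p2": 20, "p1": 10}  # ALL changes since the full backup, accumulated

def restore(full_backup, *backup_sets):
    db = dict(full_backup)
    for s in backup_sets:    # apply backup sets in chronological order
        db.update(s)
    return db

print(restore(full, inc1, inc2))  # incremental path: full + inc1 + inc2
print(restore(full, diff))        # differential path: full + latest diff only
print(restore(full, inc1, inc2) == restore(full, diff))  # True
```

Both paths reach the same state, but the incremental path needs every set in the chain, which is why its recovery is slower even though each individual backup is smaller.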
Snapshot Backups:
Description: A point-in-time copy of the entire database or a subset.
Advantages: Provides a consistent view of data at a specific moment.
Disadvantages: Can impact performance during the snapshot creation.
Continuous Data Protection (CDP):
Description: Captures changes to data in real-time.
Advantages: Minimal data loss in case of failures.
Disadvantages: Can be resource-intensive.
Online Backups:
Description: Backups performed while the database is online and operational.
Advantages: Minimal downtime for users.
Disadvantages: May impact performance during the backup process.
Recovery Techniques:
Rollback:
Description: Undoing changes made to the database to restore a previous state.
Use Case: Transaction rollback during an error or failure.
Rollforward:
Description: Applying changes from a log to bring the database forward to a consistent state.
Use Case: Recovering transactions that occurred after the last backup.



Point-in-Time Recovery:
Description: Restoring the database to a specific point in time.
Use Case: Recovering to a state before a specific error or data corruption.
Media Recovery:
Description: Restoring from backups and applying transaction logs.
Use Case: Recovery after a hardware failure or data corruption.
Parallel Recovery:
Description: Distributing recovery processes across multiple resources.
Use Case: Accelerating recovery for large databases.
Flashback Technology:
Description: Allows for viewing and reverting to previous states without restoring from
backups.
Use Case: Recovering from logical errors or user mistakes.
Warm Standby and Hot Standby:
Description: Maintaining a secondary database that is synchronized with the primary.
Use Case: Seamless switch to a standby database in case of a primary database failure.
Backup Verification:
Description: Regularly testing and validating backup files.
Use Case: Ensuring that backups are reliable and can be successfully restored.
It's important to design a comprehensive backup and recovery strategy based on the specific
requirements and constraints of the database environment. Regular testing of backup and
recovery procedures is crucial to ensure their effectiveness when needed.
