
Data Base Final

This document provides an overview of database concepts, including the evolution of databases, types of databases, database management system architecture, the database lifecycle, data modeling, entity relationship diagrams, normalization, and relational algebra. It discusses topics such as the three schema architecture with internal, conceptual, and external levels; entity relationship modeling with entities, attributes, relationships, and participation constraints; mapping ER diagrams to relational models; functional dependencies and normal forms like 1NF, 2NF, 3NF and BCNF; and relational algebra operations including select, project, union, difference, cartesian product, and rename.

Uploaded by

Danish Khan

DATA BASE SYSTEM

BY: DANISH KHAN


Table of Contents
Basic Database Concepts
Evolution of the database
Types of databases
DBMS - Architecture
3-tier Architecture
Database Life Cycle
Analysis
Database Design
Implementation
Operation
Maintenance
Three schema Architecture
Objectives of Three Schema Architecture
1. Internal Level
2. Conceptual Level
3. External Level
Mapping between Views
What is Data Modelling?
Data Models in DBMS
Types of Data Models in DBMS
Entity Relationship Diagram (ERD)
Component of ER Diagram
1. Entity
2. Attribute
3. Relationship
Participation Constraints
Generalization Aggregation
Generalization
Specialization
Inheritance
Enhanced ER Model
Relation Data Model
Concepts
Constraints
Key Constraints
Domain Constraints
Referential integrity Constraints
Mapping ER Model to Relational Model
Mapping Entity
Mapping Process (Algorithm)
Mapping Relationship
Mapping Process
Mapping Weak Entity Sets
Mapping Process
Mapping Hierarchical Entities
Mapping Process
Functional Dependency
Types of Functional dependency
1. Trivial functional dependency
2. Non-trivial functional dependency
Normalization
Data normalization rules
Data normalization example
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. Boyce-Codd Normal Form (BCNF)
Database normalization tools
Relational Algebra
Relational Algebra
Select Operation (σ)
Project Operation (∏)
Union Operation (∪)
Set Difference (−)
Cartesian Product (Χ)
Rename Operation (ρ)
Relational Calculus
Tuple Relational Calculus (TRC)
Domain Relational Calculus (DRC)
SQL
Rules
SQL process
Characteristics of SQL
Advantages of SQL
High speed
No coding needed
Well defined standards
Portability
Interactive language
Multiple data view
SQL Datatype
Datatype of SQL
1. Binary Datatypes
2. Approximate Numeric Datatype
3. Exact Numeric Datatype
4. Character String Datatype
5. Date and time Datatypes
SQL Commands
Types of SQL Commands
1. Data Definition Language (DDL)
2. Data Manipulation Language
3. Data Control Language
4. Transaction Control Language
5. Data Query Language
SQL Operator
SQL Arithmetic Operators
SQL Comparison Operators
SQL Logical Operators
SQL Table
Operation on Table
SQL Create Table
Drop table
SQL DELETE table
Views in SQL
Sample table
1. Creating view
2. Creating View from a single table
3. Creating View from multiple tables
4. Deleting View
SQL Index
1. Create Index statement
2. Unique Index statement
3. Drop Index Statement
SQL Sub Query
1. Subqueries with the Select Statement
2. Subqueries with the INSERT Statement
3. Subqueries with the UPDATE Statement
4. Subqueries with the DELETE Statement
SQL Aggregate Functions
Types of SQL Aggregation Function
1. COUNT FUNCTION
2. SUM Function
3. AVG function
4. MAX Function
5. MIN Function
SQL JOIN
Types of SQL JOIN
Sample Table
1. INNER JOIN
2. LEFT JOIN
3. RIGHT JOIN
4. FULL JOIN
Transaction Processing Systems
Transactions
Transaction Operations
Transaction States
Desirable Properties of Transactions
Schedules and Conflicts
Types of Schedules
Conflicts in Schedules
Serializability
Equivalence of Schedules
Concurrency
Concurrency Control
Advantages
Control concurrency
Main problems in using Concurrency
Concurrency control techniques
Locking
Time Stamping
Optimistic
Recovery Techniques in DBMS
What is Query Optimization?
Kerberos
Kerberos Limitations
Is Kerberos Infallible?
What is Kerberos Used For?
Data Integrity
Data Integrity vs. Data Quality vs. Data Security
Integrity of Data In a Database Table
Types of Data Integrity
Physical Integrity
Logical Integrity
Importance of Integrity in Data

Basic Database Concepts


A database system is a computer-based record-keeping system. A
collection of data, commonly called a database, contains information about a
particular enterprise. It maintains any information that may be necessary to the
decision-making process involved in the management of that organization. It can
also be defined as a collection of interrelated data stored together to serve
multiple applications. The data is stored so that it is independent of the programs that
use it. A generic and controlled approach is used to add new data and to
modify and retrieve existing data within the database. The data is structured so
as to provide the basis for future application development.
Purpose of Database
The intent of a database is that a collection of data should serve as many
applications as possible. Therefore, a database is often thought of as a repository
of the information needed to run certain functions in a corporation or organization. It
permits not only the retrieval of data but also the continuous modification of data
needed for the control of operations. It may also be possible to search the database
to obtain answers to questions or information for planning purposes.
In a typical file-processing system, permanent records are stored in different files.
Many different application programs are written to extract records from and add
records to the appropriate files. This scheme has several major limitations
and disadvantages, such as data redundancy (duplication of data), data
inconsistency, data that cannot be shared, non-standard data, insecure data, and incorrect
data. A database management system is an answer to all these problems as
it provides centralized control of the data.
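The redundancy and inconsistency problems described above can be illustrated with a small sketch. The two dictionaries stand in for separate application files, and the customer data, field names, and the helper `find_inconsistencies` are all hypothetical, chosen only for illustration.

```python
# Hypothetical data: two application "files" that each keep their own copy
# of a customer's address, as in a file-processing system (data redundancy).
billing_file = {"C001": {"name": "Asha", "address": "12 Park Road"}}
shipping_file = {"C001": {"name": "Asha", "address": "12 Park Road"}}

# The billing program updates its own copy; the shipping file is not touched,
# so the two copies now disagree (data inconsistency).
billing_file["C001"]["address"] = "7 Lake View"


def find_inconsistencies(file_a, file_b, field):
    """Return the keys whose `field` differs between the two files."""
    return [
        key
        for key in file_a
        if key in file_b and file_a[key][field] != file_b[key][field]
    ]


inconsistent = find_inconsistencies(billing_file, shipping_file, "address")
print(inconsistent)

# With centralized control the address is stored once, and both programs
# read the same record, so they cannot disagree.
customers = {"C001": {"name": "Asha", "address": "12 Park Road"}}
customers["C001"]["address"] = "7 Lake View"
billing_view = customers["C001"]["address"]
shipping_view = customers["C001"]["address"]
```

Storing each fact once is what removes the need to keep several copies in step.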
Database Abstraction
A major purpose of a database is to provide users with only as much
information as they require. This means that the system does not disclose
all the details of the data; rather, it hides some details of how the data is stored
and maintained. The complexity of the database is hidden from users through
multiple levels of abstraction, which simplify their
interaction with the system. The different levels of the database are implemented
through three layers:
1. Internal Level (Physical Level): The lowest level of abstraction, the internal
level, is closest to physical storage. It describes how the data is actually stored
on the storage medium.
2. Conceptual Level: This level of abstraction describes what data is actually
stored in the database. It also describes the relationships that exist among
the data. At this level, the database is described logically in terms of simple
data structures. Users at this level are not concerned with how these logical
data structures will be implemented at the physical level.
3. External Level (View Level): It is the level closest to users and is related to
the way the data is viewed by individual users.
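The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `employee` table, the `employee_directory` view, and the sample rows are hypothetical, and SQLite is used simply because it ships with Python.

```python
import sqlite3

# Internal level: how the data is physically stored is left to SQLite.
# Here the database lives in memory; the code below never deals with storage.
conn = sqlite3.connect(":memory:")

# Conceptual level: the logical structure of the data held in the database.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL,
        salary REAL NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO employee (emp_id, name, dept, salary) VALUES (?, ?, ?, ?)",
    [(1, "Ali", "Sales", 52000.0), (2, "Sara", "IT", 61000.0)],
)

# External level: a view for users who should see names and departments
# but not salaries.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
directory_columns = [col[0] for col in cursor.description]
directory_rows = cursor.fetchall()
print(directory_columns)
print(directory_rows)
```

A user of the view works only with the columns it exposes, which is the sense in which the external level is the one closest to individual users.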

Since a database can be viewed through three levels of abstraction, a change
at one level can affect the schemas at other levels. As a database grows, it
may need to change frequently, and such changes should not force a redesign and
re-implementation of the database. In such a context the concept of data
independence proves beneficial.
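Data independence means that a change at one level need not force changes at the level above it. A minimal sketch of the physical case, assuming a hypothetical `student` table in SQLite: an index is added at the storage level, and the same query returns the same answer before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student (roll_no, name, city) VALUES (?, ?, ?)",
    [(1, "Bilal", "Lahore"), (2, "Hina", "Karachi"), (3, "Omar", "Lahore")],
)

# The application's query is written once and is never edited below.
QUERY = "SELECT name FROM student WHERE city = ? ORDER BY roll_no"

before = conn.execute(QUERY, ("Lahore",)).fetchall()

# A change at the internal level: add an index on the city column.
conn.execute("CREATE INDEX idx_student_city ON student (city)")

after = conn.execute(QUERY, ("Lahore",)).fetchall()
print(before == after)
```

The index may change how the rows are located, but the query text and its result stay the same.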
Concept of Database
To store and manage data efficiently in a database, let us understand some key
terms:
1. Database Schema: It is the design of the database. We can say that it is the
skeleton of the database: it represents the structure, the types of data that will
be stored in the rows and columns, the constraints, and the relationships between the tables.
2. Data Constraints: In a database, we sometimes restrict
what type of data can be stored in one or more columns of a table. This
is done by using constraints. Constraints are usually defined when the
table is created.
3. Data dictionary or Metadata: Metadata is data about the data.
The database schema, along with the different types of constraints
on the data, is stored by the DBMS in the data dictionary and is known as metadata.
4. Database instance: A database instance defines the
complete database environment and its components. We can also say that it is a
set of memory structures and background processes that are used to access the
database files.
5. Query: In a database, a query is used to access data from the database.
Users write queries to retrieve or manipulate data in the database.
6. Data manipulation: In a database, we can manipulate data using
three main operations: insertion, deletion, and update.
7. Data Engine: It is an underlying component that is used to create and manage
various database queries.
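Several of the terms above (schema, constraints, metadata, query, and data manipulation) can be seen together in one short sketch. It assumes SQLite through Python's sqlite3 module; the `product` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema with constraints: name must be present and unique, price must not
# be negative.
conn.execute(
    """
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL UNIQUE,
        price      REAL NOT NULL CHECK (price >= 0)
    )
    """
)

# Metadata: SQLite records the schema in its own catalog table, sqlite_master.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'product'"
).fetchone()[0]

# Data manipulation: insertion, update, and deletion.
conn.execute("INSERT INTO product (product_id, name, price) VALUES (1, 'Pen', 1.5)")
conn.execute("INSERT INTO product (product_id, name, price) VALUES (2, 'Notebook', 4.0)")
conn.execute("UPDATE product SET price = 2.0 WHERE product_id = 1")
conn.execute("DELETE FROM product WHERE product_id = 2")

# A row that violates the CHECK constraint is rejected by the database.
try:
    conn.execute("INSERT INTO product (product_id, name, price) VALUES (3, 'Ink', -1)")
    constraint_rejected = False
except sqlite3.IntegrityError:
    constraint_rejected = True

# Query: retrieve what is left in the table.
rows = conn.execute("SELECT product_id, name, price FROM product").fetchall()
print(rows)
```

Because the rule is enforced by the database, every program that writes to the table is held to it.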

Advantages of Database
Let us consider some of the benefits provided by a database system and see how
a database system overcomes the problems mentioned above:
1. The database reduces data redundancy to a great extent.
2. The database can control data inconsistency to a great extent.
3. The database facilitates sharing of data.
4. The database enforces standards.
5. The database can ensure data security.
6. Integrity can be maintained through databases.
Therefore, for systems with better performance and efficiency, database systems
are preferred.
Disadvantages of Database
Because a database system performs complex tasks, it also has some
disadvantages:
1. Security may be compromised without good controls.
2. Integrity may be compromised without good controls.
3. Extra hardware may be required.
4. Performance overhead may be significant.
5. The system is likely to be complex.

Evolution of the database


Databases have evolved dramatically since their inception in the early
1960s. Navigational databases such as the hierarchical database (which
relied on a tree-like model and allowed only a one-to-many relationship),
and the network database (a more flexible model that allowed multiple
relationships), were the original systems used to store and manipulate data.
Although simple, these early systems were inflexible. In the 1980s, relational
databases became popular, followed by object-oriented databases in the
1990s. More recently, NoSQL databases came about as a response to the
growth of the internet and the need for faster speed and processing of
unstructured data. Today, cloud databases and self-driving databases are
breaking new ground when it comes to how data is collected, stored,
managed, and utilized.

Types of databases
There are many different types of databases. The best database for a
specific organization depends on how the organization intends to use the
data.

Relational databases
 Relational databases became dominant in the 1980s. Items in a
relational database are organized as a set of tables with columns and
rows. Relational database technology provides the most efficient and
flexible way to access structured information.

Object-oriented databases
 Information in an object-oriented database is represented in the form of
objects, as in object-oriented programming.

Distributed databases
 A distributed database consists of two or more files located in different
sites. The database may be stored on multiple computers, located in the
same physical location, or scattered over different networks.

Data warehouses
 A central repository for data, a data warehouse is a type of database
specifically designed for fast query and analysis.

DBMS - Architecture
The design of a DBMS depends on its architecture. It can be centralized or
decentralized or hierarchical. The architecture of a DBMS can be seen as
either single tier or multi-tier. An n-tier architecture divides the whole system
into related but independent n modules, which can be independently
modified, altered, changed, or replaced.
In 1-tier architecture, the DBMS is the only entity and the user works directly
on it. Any changes made here are applied directly to
the DBMS itself. It does not provide handy tools for end-users. Database
designers and programmers normally prefer to use single-tier architecture.
If the architecture of DBMS is 2-tier, then it must have an application through
which the DBMS can be accessed. Programmers use 2-tier architecture
where they access the DBMS by means of an application. Here the
application tier is entirely independent of the database in terms of operation,
design, and programming.
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the
complexity of the users and how they use the data present in the database.
It is the most widely used architecture to design a DBMS.

 Database (Data) Tier − At this tier, the database resides along with its
query processing languages. We also have the relations that define the
data and their constraints at this level.
 Application (Middle) Tier − At this tier reside the application server
and the programs that access the database. For a user, this application
tier presents an abstracted view of the database. End-users are
unaware of any existence of the database beyond the application. At
the other end, the database tier is not aware of any other user beyond
the application tier. Hence, the application layer sits in the middle and
acts as a mediator between the end-user and the database.
 User (Presentation) Tier − End-users operate on this tier and they
know nothing about any existence of the database beyond this layer.
At this layer, multiple views of the database can be provided by the
application. All views are generated by applications that reside in the
application tier.
Multiple-tier database architecture is highly modifiable, as almost all its
components are independent and can be changed independently.
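The separation of the three tiers can be sketched in a few lines of Python. This is an illustration under assumptions, not a prescribed design: the class names DataTier and ApplicationTier, the render function, and the employee rows are all hypothetical.

```python
import sqlite3


class DataTier:
    """Database tier: holds the stored relations (hypothetical employee table)."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE employee (eno INTEGER PRIMARY KEY, name TEXT, salary REAL)"
        )
        self._conn.executemany(
            "INSERT INTO employee VALUES (?, ?, ?)",
            [(1001, "Asha", 52000.0), (1009, "Ravi", 61000.0)],
        )

    def fetch_employees(self):
        return self._conn.execute(
            "SELECT eno, name, salary FROM employee ORDER BY eno"
        ).fetchall()


class ApplicationTier:
    """Middle tier: the only code that talks to the data tier."""

    def __init__(self, data_tier):
        self._data = data_tier

    def directory_view(self):
        # Abstracted view for end-users: the salary column is not exposed.
        return [
            {"eno": eno, "name": name}
            for eno, name, _salary in self._data.fetch_employees()
        ]


def render(view):
    """Presentation tier: formats a view and knows nothing about SQL."""
    return "\n".join(f"{row['eno']}: {row['name']}" for row in view)


app = ApplicationTier(DataTier())
page = render(app.directory_view())
print(page)
```

Because render only receives the view built by the application tier, the storage engine or the table layout could be swapped without touching the presentation code.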

Database Life Cycle


The life cycle of a database begins with analysis and the definition of the
problems and objectives.
The following figure displays the life cycle of a database that begins with
analysis, including feasibility study −

Let us see the steps involved −


Analysis
In the first phase, the current system’s operation is analyzed and problems
are defined. Here the objectives are also defined.
Database Design
Here steps are taken for the final product to meet the user and system
requirements
Implementation
Design specifications are implemented here.
Operation
Now the database is operational.
Maintenance
DBA performs maintenance that includes backup and recovery.

Three schema Architecture


o The three schema architecture is also called ANSI/SPARC architecture
or three-level architecture.
o This framework is used to describe the structure of a specific database
system.
o The three schema architecture is also used to separate the user
applications and physical database.
o The three schema architecture contains three-levels. It breaks the
database down into three different categories.

The three-schema architecture is as follows:



In the above diagram:

o It shows the DBMS architecture.


o Mapping is used to transform requests and responses between the
various levels of the database architecture.
o Mapping adds processing time, so it is less suitable for a small DBMS.
o In External / Conceptual mapping, the request is transformed
from the external level to the conceptual schema.
o In Conceptual / Internal mapping, the DBMS transforms the request from
the conceptual level to the internal level.

Objectives of Three Schema Architecture


The main objective of three level architecture is to enable multiple
users to access the same data with a personalized view while
storing the underlying data only once. Thus it separates the user's
view from the physical structure of the database. This separation is
desirable for the following reasons:

o Different users need different views of the same data.


o The approach in which a particular user needs to see the data may
change over time.
o The users of the database should not worry about the physical
implementation and internal workings of the database such as data
compression and encryption techniques, hashing, optimization of the
internal structures etc.
o All users should be able to access the same data according to their
requirements.
o The DBA should be able to change the conceptual structure of the database
without affecting the users.
o The internal structure of the database should be unaffected by changes to
the physical aspects of storage.

1. Internal Level

o The internal level has an internal schema which describes the physical
storage structure of the database.
o The internal schema is also known as a physical schema.
o It uses the physical data model, which defines how the data
will be stored in blocks.
o The physical level is used to describe complex low-level data structures
in detail.

The internal level is generally concerned with the following
activities:

o Storage-space-allocations.
For Example: B-Trees, Hashing etc.
o Access-paths.
For Example: Specification of primary and secondary keys, indexes,
pointers and sequencing.
o Data compression and encryption techniques.
o Optimization of internal structures.
o Representation of stored fields.

2. Conceptual Level

o The conceptual schema describes the design of a database at the
conceptual level. The conceptual level is also known as the logical level.
o The conceptual schema describes the structure of the whole database.
o The conceptual level describes what data are to be stored in the
database and also describes what relationship exists among those data.
o In the conceptual level, internal details such as an implementation of
the data structure are hidden.
o Programmers and database administrators work at this level.

3. External Level

o At the external level, a database contains several schemas, sometimes
called subschemas. A subschema is used to describe
a different view of the database.
o An external schema is also known as a view schema.
o Each view schema describes the part of the database that a particular user
group is interested in and hides the remaining database from that user
group.
o The view schema describes the end user's interaction with the database
system.

Mapping between Views


The three levels of DBMS architecture don't exist independently of
each other. There must be correspondence between the three
levels i.e. how they actually correspond with each other. DBMS is
responsible for correspondence between the three types of
schema. This correspondence is called Mapping.

There are basically two types of mapping in the database
architecture:

o Conceptual/ Internal Mapping
o External / Conceptual Mapping

Conceptual/ Internal Mapping

The Conceptual/ Internal Mapping lies between the conceptual
level and the internal level. Its role is to define the correspondence
between the records and fields of the conceptual level and files and
data structures of the internal level.

External/ Conceptual Mapping

The External/ Conceptual Mapping lies between the external level
and the conceptual level. Its role is to define the correspondence
between a particular external view and the conceptual view.
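One common way to realize an external/conceptual mapping in a relational system is an SQL view. The sketch below uses Python's sqlite3 module; the student table, the student_directory view, and the sample row are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical description of a student.
conn.execute(
    "CREATE TABLE student ("
    "id INTEGER PRIMARY KEY, name TEXT, date_of_birth TEXT, address TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Mira', '2004-05-17', '12 Hill Road')")

# External level: one user group only needs ids and names.
# The view definition is the external/conceptual mapping.
conn.execute("CREATE VIEW student_directory AS SELECT id, name FROM student")

cursor = conn.execute("SELECT * FROM student_directory")
columns = [d[0] for d in cursor.description]
external_rows = cursor.fetchall()
print(columns, external_rows)
```

Users of student_directory never see date_of_birth or address, and the base table could gain further columns without changing what this view returns.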

What is Data Modelling?

Data modeling (data modelling) is the process of creating a data model for
the data to be stored in a database. This data model is a conceptual
representation of Data objects, the associations between different data
objects, and the rules.

Data modeling helps in the visual representation of data and enforces
business rules, regulatory compliance, and government policies on the data.
Data Models ensure consistency in naming conventions, default values,
semantics, security while ensuring quality of the data.

Data Models in DBMS

The Data Model is defined as an abstract model that organizes data
description, data semantics, and consistency constraints of data. The data
model emphasizes what data is needed and how it should be organized
instead of what operations will be performed on data. A data model is like an
architect’s building plan, which helps to build conceptual models and set
relationships between data items.

Types of Data Models in DBMS

Types of Data Models: There are mainly three different types of data
models: conceptual data models, logical data models, and physical data
models, and each one has a specific purpose. The data models are used
to represent the data and how it is stored in the database and to set the
relationship between data items.
1. Conceptual Data Model: This Data Model defines WHAT the system
contains. This model is typically created by Business stakeholders and Data
Architects. The purpose is to organize, scope and define business concepts
and rules.
2. Logical Data Model: Defines HOW the system should be implemented
regardless of the DBMS. This model is typically created by Data Architects
and Business Analysts. The purpose is to develop a technical map of rules
and data structures.
3. Physical Data Model: This Data Model describes HOW the system will be
implemented using a specific DBMS system. This model is typically created
by DBAs and developers. The purpose is the actual implementation of the
database.

Types of Data Model

Entity Relationship Diagram (ERD)
Let us now learn how the ER Model is represented by means of an ER
diagram. Any object, for example, entities, attributes of an entity, relationship
sets, and attributes of relationship sets, can be represented with the help of
an ER diagram.
o ER model stands for an Entity-Relationship model. It is a high-level data
model. This model is used to define the data elements and relationship
for a specified system.
o It develops a conceptual design for the database. It also develops a
very simple and easy to design view of data.
o In ER modeling, the database structure is portrayed as a diagram called
an entity-relationship diagram.

For example, suppose we design a school database. In this database, the
student will be an entity with attributes like address, name, id, age, etc. The
address can be another entity with attributes like city, street name, pin code,
etc., and there will be a relationship between them.

Component of ER Diagram

1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an
entity is represented as a rectangle.

Consider an organization as an example: manager, product, employee,
department, etc. can each be taken as an entity.

a. Weak Entity

An entity that depends on another entity is called a weak entity. The weak entity
doesn't contain any key attribute of its own. The weak entity is represented
by a double rectangle.

2. Attribute
The attribute is used to describe the property of an entity. An ellipse is used to
represent an attribute.

For example, id, age, contact number, name, etc. can be attributes of a
student.

a. Key Attribute

The key attribute is used to represent the main characteristics of an entity. It
represents a primary key. The key attribute is represented by an ellipse with
the text underlined.

b. Composite Attribute

An attribute that is composed of many other attributes is known as a composite
attribute. The composite attribute is represented by an ellipse, and the ellipses
of its component attributes are connected to that ellipse.

c. Multivalued Attribute

An attribute can have more than one value. Such an attribute is known as a
multivalued attribute. A double oval is used to represent a multivalued
attribute.

For example, a student can have more than one phone number.

d. Derived Attribute

An attribute that can be derived from another attribute is known as a derived
attribute. It can be represented by a dashed ellipse.

For example, a person's age changes over time and can be derived from
another attribute like date of birth.

3. Relationship
A relationship is used to describe the relation between entities. Diamond or
rhombus is used to represent the relationship.

Types of relationship are as follows:

a. One-to-One Relationship

When one instance of an entity is associated with at most one instance of
another entity, it is known as a one-to-one relationship.

For example, a female can marry one male, and a male can marry one
female.

b. One-to-many relationship

When one instance of the entity on the left can be associated with more than
one instance of the entity on the right, this is known
as a one-to-many relationship.

For example, a scientist can make many inventions, but each invention is made
by one specific scientist.

c. Many-to-one relationship

When more than one instance of the entity on the left can be associated with
only one instance of the entity on the right, it is known as
a many-to-one relationship.

For example, a student enrolls in only one course, but a course can have
many students.

d. Many-to-many relationship

When more than one instance of the entity on the left and more than one
instance of the entity on the right can be associated with the relationship, it is
known as a many-to-many relationship.

For example, an employee can be assigned to many projects, and a project can have
many employees.

Participation Constraints
 Total Participation − Each entity is involved in the relationship. Total
participation is represented by double lines.
 Partial participation − Not all entities are involved in the relationship.
Partial participation is represented by single lines.

Generalization Aggregation
The ER Model has the power of expressing database entities in a conceptual
hierarchical manner. As the hierarchy goes up, it generalizes the view of
entities, and as we go deep in the hierarchy, it gives us the detail of every
entity included.
Going up in this structure is called generalization, where entities are
clubbed together to represent a more generalized view. For example, a
particular student named Mira can be generalized along with all the students.
The entity shall be a student, and further, the student is a person. The
reverse is called specialization where a person is a student, and that
student is Mira.
Generalization
As mentioned above, the process of generalizing entities, where the
generalized entities contain the properties of all the generalized entities, is
called generalization. In generalization, a number of entities are brought
together into one generalized entity based on their similar characteristics.
For example, pigeon, house sparrow, crow and dove can all be generalized
as Birds.

Specialization

Specialization is the opposite of generalization. In specialization, a group of
entities is divided into sub-groups based on their characteristics. Take a
group ‘Person’ for example. A person has name, date of birth, gender, etc.
These properties are common to all persons. But in a
company, persons can be identified as employee, employer, customer, or
vendor, based on what role they play in the company.

Similarly, in a school database, persons can be specialized as teacher,
student, or staff, based on what role they play in the school as entities.

Inheritance
We use all the above features of the ER model in order to create classes of
objects in object-oriented programming. The details of entities are generally
hidden from the user; this process is known as abstraction.

Inheritance is an important feature of Generalization and Specialization. It
allows lower-level entities to inherit the attributes of higher-level entities.

For example, the attributes of a Person class such as name, age, and gender
can be inherited by lower-level entities such as Student or Teacher.
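The Person example above maps directly onto class inheritance. The following sketch uses Python dataclasses; the roll_no and subject attributes and the sample values are hypothetical additions for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    # Higher-level (generalized) entity.
    name: str
    age: int
    gender: str


@dataclass
class Student(Person):
    # Lower-level entity: inherits name, age, gender and adds its own attribute.
    roll_no: int


@dataclass
class Teacher(Person):
    subject: str


mira = Student(name="Mira", age=19, gender="F", roll_no=42)
student_attrs = [f.name for f in fields(Student)]
print(student_attrs)
```

Student lists the three inherited attributes first, followed by its own, and every Student instance is also a Person (the IS-A relationship).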

Enhanced ER Model
Enhanced entity-relationship diagrams are advanced database diagrams
very similar to regular ER diagrams which represent requirements and
complexities of complex databases.
It is a diagrammatic technique for displaying the Sub Class and Super
Class; Specialization and Generalization; Union or Category; Aggregation
etc.
Generalization and Specialization –
These are very common relationships found in real entities. However, this
kind of relationship was added later as an enhanced extension to the
classical ER model. Specialized classes are often called subclasses, while
a generalized class is called a superclass, probably inspired by
object-oriented programming. A sub-class is best understood by “IS-A
analysis”. Statements such as “Technician IS-A Employee” and
“Laptop IS-A Computer” illustrate the idea.
An entity can be a specialized type/class of another entity. For example, a
Technician is a special Employee; in a university system, Faculty is a
special class of Employees. We call this phenomenon
generalization/specialization. In the example here Employee is a
generalized entity class while Technician and Faculty are specialized
classes of Employee.
Example – This example shows an instance of “sub-class” relationships. Here we
have four sets of employees: Employee, Secretary, Technician, and Engineer.
Employee is the super-class of the other three sets; each individual sub-class is a
subset of the Employee set.

 An entity belonging to a sub-class is related to some super-class entity.
For instance, emp no 1001 is a secretary, and his typing speed is 68.
Emp no 1009 is an engineer (sub-class) and her trade is “Electrical”, and so
forth.
 Sub-class entity “inherits” all attributes of super-class; for example,
employee 1001 will have attributes eno, name, salary, and typing
speed.
Enhanced ER model of above example –

Constraints – There are two types of constraints on the “Sub-class”
relationship.
1. Total or Partial – A sub-classing relationship is total if every
super-class entity must be associated with some sub-class entity, otherwise
partial. The sub-classing “job type based employee category” is partial:
not every employee is necessarily one of (secretary,
engineer, and technician), i.e. the union of these three types is a proper
subset of all employees. The other sub-classing, “Salaried
Employee AND Hourly Employee”, is total; the union of entities from
the sub-classes is equal to the total employee set, i.e. every employee
necessarily has to be one of them.
2. Overlapped or Disjoint – If an entity from the super-class set can occur
in multiple sub-class sets, then it is overlapped sub-classing,
otherwise disjoint. Both examples, job-type based and
salaried/hourly employee sub-classing, are disjoint.
Note – These constraints are independent of each other: a sub-classing can be
“overlapped and total or partial” or “disjoint and total or partial”. Also,
sub-classing is transitive.
Multiple Inheritance (sub-class of multiple superclasses) –
An entity can be a sub-class of multiple entity types; such an entity
has multiple super-classes. For example, Teaching
Assistant can be a subclass of both Employee and Student. A faculty member in a
university system can be a subclass of Employee and Alumnus. In
multiple inheritance, the attributes of the sub-class are the union of the attributes of
all super-classes.
Union –

 Set of Library Members is UNION of Faculty, Student, and Staff. A
union relationship indicates either type; for example, a library member
is either Faculty or Staff or Student.
 Below are two examples that show how UNION can be depicted in
ERD – Vehicle Owner is UNION of PERSON and Company, and RTO
Registered Vehicle is UNION of Car and Truck.

You might see some confusion between Sub-class and UNION. Consider the
example in the above figure: Vehicle is the super-class of Car and Truck, which is
very much a correct example of sub-classing as well. Here, however, we use it
differently: we are saying RTO Registered Vehicle is the UNION of Car and
Truck. They do not inherit any attribute of Vehicle, and the attributes of car and
truck are altogether independent sets, whereas in a sub-classing situation car
and truck would inherit the attributes of the vehicle class.
Relational Data Model
Relational data model is the primary data model, which is used widely around
the world for data storage and processing. This model is simple and it has
all the properties and capabilities required to process data with storage
efficiency.
Concepts
Tables − In the relational data model, relations are saved in the format of tables.
This format stores the relation among entities. A table has rows and columns,
where rows represent records and columns represent the attributes.
Tuple − A single row of a table, which contains a single record for that
relation is called a tuple.
Relation instance − A finite set of tuples in the relational database system
represents relation instance. Relation instances do not have duplicate tuples.
Relation schema − A relation schema describes the relation name (table
name), attributes, and their names.
Relation key − Each row has one or more attributes, known as relation key,
which can identify the row in the relation (table) uniquely.
Attribute domain − Every attribute has some pre-defined value scope,
known as attribute domain.
Constraints
Every relation has some conditions that must hold for it to be a valid relation.
These conditions are called Relational Integrity Constraints. There are
three main integrity constraints −
 Key constraints
 Domain constraints
 Referential integrity constraints
Key Constraints
There must be at least one minimal subset of attributes in the relation, which
can identify a tuple uniquely. This minimal subset of attributes is
called a key for that relation. If there is more than one such minimal subset,
these are called candidate keys.
Key constraints force that −
 in a relation with a key attribute, no two tuples can have identical values
for key attributes.
 a key attribute cannot have NULL values.
Key constraints are also referred to as Entity Constraints.
Domain Constraints
Attributes have specific values in real-world scenarios. For example, age can
only be a positive integer. The same constraints are applied
to the attributes of a relation. Every attribute is bound to have a specific
range of values. For example, age cannot be less than zero and telephone
numbers cannot contain a digit outside 0-9.
Referential integrity Constraints
Referential integrity constraints work on the concept of Foreign Keys. A
foreign key is an attribute of a relation that refers to a key attribute of another
(or the same) relation.
Referential integrity constraint states that if a relation refers to a key attribute
of a different or same relation, then that key element must exist.
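The three kinds of integrity constraint can be observed with Python's sqlite3 module. This is a sketch under assumptions: the department and employee tables are hypothetical, and the domain rule is expressed as a CHECK constraint. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, "                      # key constraint
    "age INTEGER CHECK (age >= 0), "                    # domain constraint
    "dept_id INTEGER REFERENCES department(dept_id))"   # referential integrity
)
conn.execute("INSERT INTO department VALUES (10, 'Research')")
conn.execute("INSERT INTO employee VALUES (1, 30, 10)")


def violates(sql):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "key": violates("INSERT INTO employee VALUES (1, 41, 10)"),          # duplicate emp_id
    "domain": violates("INSERT INTO employee VALUES (2, -5, 10)"),       # negative age
    "referential": violates("INSERT INTO employee VALUES (3, 25, 99)"),  # no department 99
}
print(results)
```

Each of the three inserts is rejected, so the table still holds only the one valid employee row.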
Mapping ER Model to Relational Model

ER Model, when conceptualized into diagrams, gives a good overview of
entity-relationship, which is easier to understand. ER diagrams can be
mapped to relational schema, that is, it is possible to create relational
schema using ER diagram. We cannot import all the ER constraints into
relational model, but an approximate schema can be generated.
There are several processes and algorithms available to convert ER
Diagrams into Relational Schema. Some of them are automated and some
of them are manual. We focus here on mapping the diagram contents
to relational basics.
ER diagrams mainly comprise of −
 Entity and its attributes
 Relationship, which is association among entities.
Mapping Entity
An entity is a real-world object with some attributes.

Mapping Process (Algorithm)


 Create table for each entity.
 Entity's attributes should become fields of tables with their respective
data types.
 Declare primary key.
Mapping Relationship
A relationship is an association among entities.

Mapping Process
 Create table for a relationship.
 Add the primary keys of all participating Entities as fields of table with
their respective data types.
 If relationship has any attribute, add each attribute as field of table.
 Declare a primary key composed of all the primary keys of participating
entities.
 Declare all foreign key constraints.
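The steps above can be sketched for a many-to-many relationship between two entities. The employee, project, and works_on tables and the hours attribute are hypothetical; sqlite3 is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- One table per entity, each with its primary key.
    CREATE TABLE employee (emp_id  INTEGER PRIMARY KEY, name  TEXT);
    CREATE TABLE project  (proj_id INTEGER PRIMARY KEY, title TEXT);

    -- One table for the relationship.
    CREATE TABLE works_on (
        emp_id  INTEGER NOT NULL REFERENCES employee(emp_id),  -- foreign key
        proj_id INTEGER NOT NULL REFERENCES project(proj_id),  -- foreign key
        hours   REAL,                                          -- relationship attribute
        PRIMARY KEY (emp_id, proj_id)                          -- composite primary key
    );
    """
)

# Inspect what was declared: column 5 of table_info is the position in the primary key.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(works_on)") if row[5] > 0]
referenced = {row[2] for row in conn.execute("PRAGMA foreign_key_list(works_on)")}
print(pk_columns, referenced)
```

The relationship table's primary key is made of the two participating primary keys, and each of them is also a foreign key back to its entity table.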
Mapping Weak Entity Sets
A weak entity set is one which does not have any primary key associated
with it.

Mapping Process
 Create table for weak entity set.
 Add all its attributes to table as field.
 Add the primary key of identifying entity set.
 Declare all foreign key constraints.
Mapping Hierarchical Entities
ER specialization or generalization comes in the form of hierarchical entity
sets.

Mapping Process
 Create tables for all higher-level entities.
 Create tables for lower-level entities.
 Add primary keys of higher-level entities in the table of lower-level
entities.
 In lower-level tables, add all other attributes of lower-level entities.
 Declare primary key of higher-level table and the primary key for lower-
level table.
 Declare foreign key constraints.
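A possible reading of these steps, sketched with sqlite3: the lower-level table reuses the higher-level primary key as both its own primary key and a foreign key. The person and student tables and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    -- Higher-level entity.
    CREATE TABLE person (
        person_id     INTEGER PRIMARY KEY,
        name          TEXT,
        date_of_birth TEXT
    );
    -- Lower-level entity: carries the higher-level key plus its own attributes.
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY REFERENCES person(person_id),
        roll_no   INTEGER
    );
    INSERT INTO person  VALUES (1, 'Mira', '2004-05-17');
    INSERT INTO student VALUES (1, 42);
    """
)

# Inherited attributes are recovered with a join on the shared key.
row = conn.execute(
    "SELECT p.name, s.roll_no FROM student s JOIN person p ON p.person_id = s.person_id"
).fetchone()

# A student without a matching person is rejected by the foreign key.
try:
    conn.execute("INSERT INTO student VALUES (2, 7)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(row, orphan_rejected)
```

Other mappings of a hierarchy exist, such as a single table for all levels; this sketch only follows the table-per-level steps listed above.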
Functional Dependency

A functional dependency is a relationship that exists between two
attributes. It typically exists between the primary key and a non-key attribute
within a table.

1. X → Y

The left side of the FD is known as the determinant, and the right side
is known as the dependent.

For example:

Assume we have an employee table with attributes: Emp_Id, Emp_Name,
Emp_Address.

Here the Emp_Id attribute can uniquely identify the Emp_Name attribute of the
employee table because if we know the Emp_Id, we can tell the employee
name associated with it.

Functional dependency can be written as:

1. Emp_Id → Emp_Name

Types of Functional dependency

1. Trivial functional dependency


o A → B has trivial functional dependency if B is a subset of A.
o The following dependencies are also trivial like: A → A, B → B

Example:

1. Consider a table with two columns Employee_Id and Employee_Name.
2. {Employee_Id, Employee_Name} → Employee_Id is a trivial functional
dependency as
3. Employee_Id is a subset of {Employee_Id, Employee_Name}.
4. Also, Employee_Id → Employee_Id and Employee_Name → Employee_Name
are trivial dependencies too.
2. Non-trivial functional dependency
o A → B has a non-trivial functional dependency if B is not a subset of A.
o When the intersection of A and B is empty, A → B is called a completely
non-trivial dependency.

Example:

1. ID → Name
2. Name → DOB
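Whether a functional dependency is contradicted by the rows of a table, and whether it is trivial, can both be checked mechanically. The sketch below is illustrative: the helper names fd_holds and is_trivial and the sample employee rows are hypothetical. Keep in mind that sample rows can only refute a dependency; they cannot prove that it holds for all future data.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first dependent value seen for this determinant.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


def is_trivial(lhs, rhs):
    """X -> Y is trivial when Y is a subset of X."""
    return set(rhs) <= set(lhs)


employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_Address": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_Address": "Delhi"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_Address": "Agra"},
]

id_determines_name = fd_holds(employees, ["Emp_Id"], ["Emp_Name"])
name_determines_address = fd_holds(employees, ["Emp_Name"], ["Emp_Address"])
trivial = is_trivial(["Employee_Id", "Employee_Name"], ["Employee_Id"])
print(id_determines_name, name_determines_address, trivial)
```

In this sample, Emp_Id → Emp_Name is consistent with the data, while Emp_Name → Emp_Address is refuted because two employees named Asha have different addresses.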
Normalization

Database normalization is the process of organizing data into tables in
such a way that the results of using the database are always unambiguous
and as intended. Such normalization is intrinsic to relational
database theory. It may duplicate some values, such as keys, across
tables and often results in the creation of additional tables.

The concept of database normalization is generally traced back to E.F.
Codd, an IBM researcher who, in 1970, published a paper describing the
relational database model. What Codd described as "a normal form for
database relations" was an essential element of the relational technique.
Such data normalization found a ready audience in the 1970s and 1980s --
a time when disk drives were quite expensive and a highly efficient means
for data storage was very necessary. Since that time, other techniques,
including denormalization, have also found favor.

Data normalization rules

While data normalization rules tend to increase the duplication of some data, such
as key values, they do not introduce data redundancy, which is unnecessary
duplication. Database normalization is typically a refinement process after the
initial exercise of identifying the data objects that should be in the relational
database, identifying their relationships and defining the tables required and the
columns within each table.

Data normalization example

Customer   Item purchased   Purchase price
------------------------------------------
Thomas     Shirt            $40
Maria      Tennis shoes     $35
Evelyn     Shirt            $40
Pajaro     Trousers         $25

If this table is used for the purpose of keeping track of the price of items and the
user wants to delete one of the customers, he or she will also delete the price.
Normalizing the data would mean understanding this and solving the problem by
dividing this table into two tables, one with information about each customer and
the product they bought and the second with each product and its price. Making
additions or deletions to either table would not affect the other.
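The deletion problem and its fix can be shown with plain Python data structures. The rows come from the table above; the variable names are chosen here for illustration.

```python
# Original single table: (customer, item purchased, purchase price).
purchases = [
    ("Thomas", "Shirt", 40),
    ("Maria", "Tennis shoes", 35),
    ("Evelyn", "Shirt", 40),
    ("Pajaro", "Trousers", 25),
]

# Deleting Pajaro from the single table also loses the price of trousers.
remaining = [row for row in purchases if row[0] != "Pajaro"]
prices_after_delete = {item: price for _, item, price in remaining}

# Normalized design: one table for who bought what, one for item prices.
customer_items = [(customer, item) for customer, item, _ in purchases]
item_prices = {item: price for _, item, price in purchases}

# Deleting Pajaro now touches only the first table.
customer_items = [(c, i) for c, i in customer_items if c != "Pajaro"]

print("Trousers" in prices_after_delete)  # price lost in the single-table design
print(item_prices["Trousers"])            # price kept in the normalized design
```

The price of a shirt is also stored once in item_prices instead of once per customer who bought one.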

1. First Normal Form –

If a relation contains a composite or multi-valued attribute, it violates first
normal form; equivalently, a relation is in first normal form if it does not contain any
composite or multi-valued attribute. A relation is in first normal form if
every attribute in that relation is a single-valued attribute.

 Example 1 – Relation STUDENT in table 1 is not in 1NF because of
multi-valued attribute STUD_PHONE. Its decomposition into 1NF has
been shown in table 2.

 Example 2 –

ID   Name   Courses
-------------------
1    A      c1, c2
2    E      c3
3    M      c2, c3

In the above table Courses is a multi-valued attribute, so it is not in 1NF.
The table below is in 1NF as there is no multi-valued attribute.

ID   Name   Course
------------------
1    A      c1
1    A      c2
2    E      c3
3    M      c2
3    M      c3
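The conversion in Example 2 amounts to emitting one row per value of the multi-valued attribute. A minimal sketch, with the rows taken from the tables above and the variable names chosen for illustration:

```python
# Not in 1NF: the third field holds a list of courses.
unnormalized = [
    (1, "A", ["c1", "c2"]),
    (2, "E", ["c3"]),
    (3, "M", ["c2", "c3"]),
]

# 1NF: repeat ID and Name once for every course, so each field is single-valued.
first_nf = [
    (student_id, name, course)
    for student_id, name, courses in unnormalized
    for course in courses
]

for row in first_nf:
    print(row)
```

The result has five rows, matching the 1NF table above. ID alone no longer identifies a row; the pair (ID, Course) does.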

2. Second Normal Form –

To be in second normal form, a relation must be in first normal form and the
relation must not contain any partial dependency. A relation is in 2NF if it
has no partial dependency, i.e., no non-prime attribute (an attribute which
is not part of any candidate key) is dependent on any proper subset of
any candidate key of the table.

Partial Dependency – If a proper subset of a candidate key determines a
non-prime attribute, it is called partial dependency.
 Example 1 – Consider table 3 below.

STUD_NO   COURSE_NO   COURSE_FEE
--------------------------------
1         C1          1000
2         C2          1500
1         C4          2000
4         C3          1000
4         C1          1000
2         C5          2000

{Note that there are many courses having the same course fee.}
Here,
COURSE_FEE cannot alone decide the value of COURSE_NO or
STUD_NO;
COURSE_FEE together with STUD_NO cannot decide the value of
COURSE_NO;
COURSE_FEE together with COURSE_NO cannot decide the value of
STUD_NO;
Hence,
COURSE_FEE is a non-prime attribute, as it does not belong to
the only candidate key {STUD_NO, COURSE_NO};
But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is
dependent on COURSE_NO, which is a proper subset of the candidate
key. The non-prime attribute COURSE_FEE is dependent on a proper
subset of the candidate key, which is a partial dependency, and so this
relation is not in 2NF.
To convert the above relation to 2NF, we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE

Table 1
STUD_NO COURSE_NO
-----------------
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5

Table 2
COURSE_NO COURSE_FEE
--------------------
C1 1000
C2 1500
C3 1000
C4 2000
C5 2000
NOTE: 2NF reduces the redundant data stored in the database. For instance, if 100 students take course C1, we do not need to store its fee of 1000 in all 100 records; instead, we store it once in the second table as the course fee for C1.
 Example 2 – Consider following functional dependencies in relation R
(A, B , C, D )
 AB -> C [A and B together determine C]
BC -> D [B and C together determine D]
In the above relation, AB is the only candidate key and there is no partial dependency, i.e., no proper subset of AB determines any non-prime attribute.
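
The partial-dependency test used in both examples can be sketched with an attribute-closure routine. The function names closure and partial_dependencies, and the encoding of each FD as a pair of attribute tuples, are assumptions made for this illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(candidate_key, non_prime, fds):
    """List (subset, attribute) pairs where a proper subset of the key
    determines a non-prime attribute."""
    found = []
    key = sorted(candidate_key)
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            for attr in sorted(closure(subset, fds) & set(non_prime)):
                found.append((subset, attr))
    return found


# Example 1: COURSE_NO -> COURSE_FEE with candidate key {STUD_NO, COURSE_NO}.
example1 = partial_dependencies(
    {"STUD_NO", "COURSE_NO"},
    {"COURSE_FEE"},
    [(("COURSE_NO",), ("COURSE_FEE",))],
)

# Example 2: R(A, B, C, D) with AB -> C and BC -> D, candidate key AB.
example2 = partial_dependencies(
    {"A", "B"},
    {"C", "D"},
    [(("A", "B"), ("C",)), (("B", "C"), ("D",))],
)

print(example1)
print(example2)
```

A non-empty result means the relation is not in 2NF; an empty list means no partial dependency was found for that candidate key.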

3. Third Normal Form –

A relation is in third normal form if it is in second normal form and has no transitive dependency for non-prime attributes.
Equivalently, a relation is in 3NF if at least one of the following conditions holds for every non-trivial functional dependency X –> Y:
1. X is a super key.
2. Y is a prime attribute (each element of Y is part of some candidate key).

Transitive dependency – If A->B and B->C are two FDs then A->C is
called transitive dependency.
 Example 1 – In relation STUDENT given in Table 4,
FD set: {STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_STATE -> STUD_COUNTRY, STUD_NO -> STUD_AGE}
Candidate Key: {STUD_NO}
For this relation, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY hold, so STUD_COUNTRY is transitively dependent on STUD_NO. This violates the third normal form. To convert the relation to third normal form, we decompose STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)
 Example 2 – Consider relation R(A, B, C, D, E)
A -> BC,
CD -> E,
B -> D,
E -> A
The candidate keys of the above relation are A, E, CD and BC.
Every attribute on the right-hand side of each functional dependency is prime, so the relation is in 3NF.

4. Boyce-Codd Normal Form (BCNF) –

A relation R is in BCNF if R is in Third Normal Form and, for every FD, the LHS is a super key. Equivalently, a relation is in BCNF iff in every non-trivial functional dependency X –> Y, X is a super key.
 Example 1 – Find the highest normal form of a relation
R(A,B,C,D,E) with FD set as {BC->D, AC->BE, B->E}
Step 1. (AC)+ = {A, C, B, E, D}, but none of its proper subsets can determine all attributes of the relation, so AC is a candidate key. Neither A nor C can be derived from any other attribute of the relation, so {AC} is the only candidate key.
Step 2. Prime attributes are those that are part of a candidate key, {A, C} in this example; the others, {B, D, E}, are non-prime.
Step 3. The relation R is in 1st normal form, as a relational DBMS does not allow multi-valued or composite attributes.
The relation is in 2nd normal form because BC->D does not violate it (BC is not a proper subset of the candidate key AC), AC->BE does not violate it (AC is the candidate key), and B->E does not violate it (B is not a proper subset of the candidate key AC).
The relation is not in 3rd normal form because in BC->D neither is BC a super key nor is D a prime attribute, and in B->E neither is B a super key nor is E a prime attribute; to satisfy 3rd normal form, either the LHS of an FD must be a super key or the RHS must be a prime attribute.
So the highest normal form of the relation is 2nd normal form.
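
The three steps above can be sketched as a small Python routine. It is a simplified illustration under stated assumptions: attribute names are single letters, FDs are given as (LHS, RHS) strings, candidate keys are found by brute force, and only the listed FDs are checked. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(attributes)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == all_attrs:
                keys.append(candidate)
    return keys


def highest_normal_form(attributes, fds):
    """Return 'BCNF', '3NF', '2NF' or '1NF' for a relation assumed to be in 1NF."""
    all_attrs = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = all_attrs - prime
    # One (LHS, attribute) pair per non-trivial dependency.
    pairs = [(set(lhs), a) for lhs, rhs in fds for a in rhs if a not in lhs]

    def is_superkey(attrs):
        return closure(attrs, fds) == all_attrs

    if all(is_superkey(lhs) for lhs, _ in pairs):
        return "BCNF"
    if all(is_superkey(lhs) or a in prime for lhs, a in pairs):
        return "3NF"
    partial = any(
        closure(subset, fds) & non_prime
        for key in keys
        for size in range(1, len(key))
        for subset in combinations(sorted(key), size)
    )
    return "1NF" if partial else "2NF"


bcnf_example = [("BC", "D"), ("AC", "BE"), ("B", "E")]
third_nf_example = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]

print(candidate_keys("ABCDE", bcnf_example))
print(highest_normal_form("ABCDE", bcnf_example))
print(highest_normal_form("ABCDE", third_nf_example))
```

The first FD set is the one from Example 1 above; the second is the FD set A -> BC, CD -> E, B -> D, E -> A on R(A, B, C, D, E) from the third normal form section.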
 Example 2 –For example consider relation R(A, B, C)
A -> BC,
B ->

Database normalization tools


Data modeling software can incorporate features that help automate preparing
incoming data for analysis. IT managers still need to develop a plan to address
common problems, including data normalization. Vendors in data normalization
include 360Science, ApexSQL and many other smaller niche developers.

Relational Algebra
Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It uses
operators to perform queries. An operator can be either unary or binary.
They accept relations as their input and yield relations as their output.
Relational algebra is performed recursively on a relation and intermediate
results are also considered relations.
The fundamental operations of relational algebra are as follows −
 Select
 Project
 Union
 Set difference
 Cartesian product
 Rename
We will discuss all these operations in the following sections.
Select Operation (σ)
It selects tuples that satisfy the given predicate from a relation.
Notation − σ p (r)
Where σ denotes the selection operator, p is the selection predicate, and r stands for the relation. p is a propositional logic formula which may use connectors like and, or, and not. Its terms may use relational operators like − =, ≠, ≥, <, >, ≤.
For example −
σ subject = "database" (Books)
Output − Selects tuples from Books where subject is 'database'.
σ subject = "database" and price = 450 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450.
σ subject = "database" and price = 450 or year > 2010 (Books)
Output − Selects tuples from Books where subject is 'database' and price is 450, or those books published after 2010.
Project Operation (∏)
It keeps only the listed column(s) of a relation and discards the rest.
Notation − ∏ A1, A2, ..., An (r)
Where A1, A2, ..., An are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
For example −
∏ subject, author (Books)
Projects the columns named subject and author from the relation Books.
Union Operation (∪)
It performs binary union between two given relations and is defined as −
r ∪ s = { t | t ∈ r or t ∈ s}
Notation − r ∪ s
Where r and s are either database relations or relation result sets (temporary relations).
For a union operation to be valid, the following conditions must hold −
 r and s must have the same number of attributes.
 Attribute domains must be compatible.
Duplicate tuples are automatically eliminated from the result.
∏ author (Books) ∪ ∏ author (Articles)
Output − Projects the names of the authors who have either written a book
or an article or both.
Set Difference (−)
The result of the set difference operation is the set of tuples that are present in one relation but not in the second relation.
Notation − r − s
Finds all the tuples that are present in r but not in s.
∏ author (Books) − ∏ author (Articles)
Output − Provides the name of authors who have written books but not
articles.
Cartesian Product (×)
Combines information of two different relations into one.
Notation − r × s
Where r and s are relations and their output will be defined as −
r × s = { q t | q ∈ r and t ∈ s}
σ author = 'tutorialspoint' (Books × Articles)
Output − Yields a relation which shows all the books and articles written by tutorialspoint.
Rename Operation (ρ)
The results of relational algebra are also relations but without any name. The
rename operation allows us to rename the output relation. 'rename' operation
is denoted with small Greek letter rho ρ.
Notation − ρ x (E)
Where the result of expression E is saved with name of x.
Additional operations are −
 Set intersection
 Assignment
 Natural join
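
The fundamental operations described above can be sketched in Python by treating a relation as a schema plus a set of tuples. The representation, the function names, and the sample Books and Articles data are all assumptions made for this illustration; note also that rename here relabels the attributes, whereas ρ in the notes names the whole result.

```python
def relation(schema, rows):
    """Build a relation as (attribute names, set of tuples)."""
    return (tuple(schema), set(map(tuple, rows)))


def select(r, predicate):
    """σ: keep the tuples for which predicate(row_as_dict) is true."""
    schema, rows = r
    return (schema, {t for t in rows if predicate(dict(zip(schema, t)))})


def project(r, attrs):
    """∏: keep only the listed attributes; the set removes duplicates."""
    schema, rows = r
    idx = [schema.index(a) for a in attrs]
    return (tuple(attrs), {tuple(t[i] for i in idx) for t in rows})


def union(r, s):
    """∪: requires the same number of attributes in both relations."""
    assert len(r[0]) == len(s[0]), "relations must have the same arity"
    return (r[0], r[1] | s[1])


def difference(r, s):
    """−: tuples present in r but not in s."""
    assert len(r[0]) == len(s[0]), "relations must have the same arity"
    return (r[0], r[1] - s[1])


def product(r, s):
    """×: every tuple of r concatenated with every tuple of s."""
    return (r[0] + s[0], {q + t for q in r[1] for t in s[1]})


def rename(r, new_schema):
    """ρ (attribute form): give the result new attribute names."""
    return (tuple(new_schema), r[1])


books = relation(
    ("title", "author", "subject"),
    [("DB Basics", "Ann", "database"), ("Nets", "Bob", "network"), ("SQL Guide", "Cy", "database")],
)
articles = relation(("title", "author"), [("Indexing", "Ann"), ("Routing", "Dee")])

database_books = select(books, lambda t: t["subject"] == "database")
book_authors = project(books, ["author"])
article_authors = project(articles, ["author"])
either = union(book_authors, article_authors)
books_only = difference(book_authors, article_authors)
pairs = product(books, rename(articles, ("a_title", "a_author")))

print(sorted(either[1]))
print(sorted(books_only[1]))
```

Because every function returns a relation in the same form, the calls can be nested, which mirrors the statement that intermediate results are also relations.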
Relational Calculus
In contrast to Relational Algebra, Relational Calculus is a non-procedural
query language, that is, it tells what to do but never explains how to do it.
Relational calculus exists in two forms −

Tuple Relational Calculus (TRC)


The filtering variable ranges over tuples.
Notation − {T | Condition}
Returns all tuples T that satisfy the condition.
For example −
{ T.name | Author(T) AND T.article = 'database' }
Output − Returns tuples with 'name' from Author who has written article on
'database'.
TRC can be quantified. We can use Existential (∃) and Universal Quantifiers
(∀).
For example −
{ R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)}

Output − The above query will yield the same result as the previous one.
Domain Relational Calculus (DRC)
In DRC, the filtering variable uses the domain of attributes instead of entire
tuple values (as done in TRC, mentioned above).
Notation −
{ a1, a2, a3, ..., an | P (a1, a2, a3, ... ,an)}
Where a1, a2, ..., an are attributes and P stands for a formula built from these attributes.
For example −
{< article, page, subject > | < article, page, subject > ∈ TutorialsPoint ∧ subject = 'database'}
Output − Yields Article, Page, and Subject from the relation TutorialsPoint,
where subject is database.
Just like TRC, DRC can also be written using existential and universal
quantifiers. DRC also involves relational operators.
The expressive power of Tuple Relational Calculus and Domain Relational Calculus is equivalent to that of Relational Algebra.
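
The declarative flavour of relational calculus is close to Python set comprehensions, which state the condition a result must satisfy rather than the steps to compute it. The sketch below mirrors the TRC and DRC examples; the sample rows are invented for illustration.

```python
# Hypothetical Author relation: one dict per tuple.
authors = [
    {"name": "Ann", "article": "database"},
    {"name": "Bob", "article": "network"},
    {"name": "Cy", "article": "database"},
]

# TRC: { T.name | Author(T) AND T.article = 'database' }
trc = {t["name"] for t in authors if t["article"] == "database"}

# TRC with an existential quantifier:
# { R | ∃T ∈ Authors (T.article = 'database' AND R.name = T.name) }
all_names = {t["name"] for t in authors}
trc_exists = {
    name
    for name in all_names
    if any(t["article"] == "database" and t["name"] == name for t in authors)
}

# DRC: variables range over attribute values instead of whole tuples.
tutorials = [("Indexing", 12, "database"), ("Routing", 40, "network")]
drc = {
    (article, page, subject)
    for (article, page, subject) in tutorials
    if subject == "database"
}

print(trc, trc_exists, drc)
```

Both TRC forms give the same set of names, matching the remark that the quantified query yields the same result as the first one.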

SQL

o SQL stands for Structured Query Language. It is used for storing and managing data in a relational database management system (RDBMS).
o It is a standard language for relational database systems. It enables a user to create, read, update and delete relational databases and tables.
o RDBMSs such as MySQL, Informix, Oracle, MS Access and SQL Server use SQL as their standard database language.
o SQL allows users to query the database in a number of ways, using English-like statements.

Rules:

SQL follows the following rules:

o Structured Query Language keywords are not case sensitive. Generally, keywords of SQL are written in uppercase.
o SQL statements are not tied to text lines. A single SQL statement can be written on one or multiple text lines.
o Using SQL statements, you can perform most of the actions in a database.
o SQL is based on tuple relational calculus and relational algebra.

SQL process:
o When an SQL command is executed for any RDBMS, the system figures out the best way to carry out the request, and the SQL engine determines how to interpret the task.
o Various components are involved in this process, such as the optimization engine, the query engine, the query dispatcher, and the classic query engine.
o All non-SQL queries are handled by the classic query engine, but the SQL query engine won't handle logical files.

Characteristics of SQL

o SQL is easy to learn.


o SQL is used to access data from relational database management
systems.
o SQL can execute queries against the database.
o SQL is used to describe the data.
o SQL is used to define the data in the database and manipulate it when
needed.
o SQL is used to create and drop the database and table.
o SQL is used to create a view, stored procedure, function in a database.
o SQL allows users to set permissions on tables, procedures, and views.

Advantages of SQL

There are the following advantages of SQL:

High speed

Using SQL queries, the user can quickly and efficiently retrieve a large number of records from a database.

No coding needed

In the standard SQL, it is very easy to manage the database system. It doesn't
require a substantial amount of code to manage the database system.

Well defined standards

SQL databases follow long-established standards defined by ISO and ANSI.

Portability

SQL can be used in laptop, PCs, server and even some mobile phones.

Interactive language

SQL is a domain language used to communicate with the database. It is also used to receive answers to complex questions in seconds.

Multiple data view

Using the SQL language, the users can make different views of the database
structure.

SQL Datatype

o SQL Datatype is used to define the values that a column can contain.
o Every column is required to have a name and data type in the database
table.

Datatype of SQL:

1. Binary Datatypes

There are three binary datatypes, which are given below:

Data Type  Description
---------  -----------
binary     It has a maximum length of 8000 bytes. It contains fixed-length binary data.
varbinary  It has a maximum length of 8000 bytes. It contains variable-length binary data.
image      It has a maximum length of 2,147,483,647 bytes. It contains variable-length binary data.

2. Approximate Numeric Datatype :

The subtypes are given below:

Data type  From        To         Description
---------  ----------  ---------  -----------
float      -1.79E+308  1.79E+308  It is used to specify a floating-point value, e.g. 6.2, 2.9, etc.
real       -3.40E+38   3.40E+38   It specifies a single-precision floating-point number.

3. Exact Numeric Datatype

The subtypes are given below:

Data type Description

int It is used to specify an integer value.

smallint It is used to specify small integer value.

bit It has the number of bits to store.

decimal It specifies a numeric value that can have a decimal number.

numeric It is used to specify a numeric value.

4. Character String Datatype

The subtypes are given below:

Data type  Description
---------  -----------
char       It has a maximum length of 8000 characters. It contains fixed-length non-unicode characters.
varchar    It has a maximum length of 8000 characters. It contains variable-length non-unicode characters.
text       It has a maximum length of 2,147,483,647 characters. It contains variable-length non-unicode characters.

5. Date and time Datatypes

The subtypes are given below:

Datatype Description

date It is used to store the year, month, and days value.

time It is used to store the hour, minute, and second values.

timestamp It stores the year, month, day, hour, minute, and the second value.

SQL Commands

o SQL commands are instructions used to communicate with the database. They are also used to perform specific tasks, functions, and queries on data.
o SQL can perform various tasks such as creating a table, adding data to tables, dropping a table, modifying a table, and setting permissions for users.

Types of SQL Commands

There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.

1. Data Definition Language (DDL)


o DDL changes the structure of a table, for example by creating, deleting, or altering it.
o In systems such as Oracle and MySQL, DDL commands are auto-committed, which means they permanently save all changes to the database.

Here are some commands that come under DDL:

o CREATE
o ALTER
o DROP
o TRUNCATE

a. CREATE: It is used to create a new table in the database.

Syntax:

CREATE TABLE TABLE_NAME (COLUMN_NAME DATATYPES[,....]);

Example:

CREATE TABLE EMPLOYEE(Name VARCHAR2(20), Email VARCHAR2(100), DOB DATE);

b. DROP: It is used to delete both the structure of a table and the records stored in it.

Syntax

DROP TABLE table_name;

Example

DROP TABLE EMPLOYEE;

c. ALTER: It is used to alter the structure of a table. The change may modify the characteristics of an existing attribute or add a new attribute.

Syntax:

To add a new column to the table:

ALTER TABLE table_name ADD column_name column_definition;

To modify an existing column in the table:

ALTER TABLE table_name MODIFY (column_definitions....);

EXAMPLE

ALTER TABLE STU_DETAILS ADD (ADDRESS VARCHAR2(20));
ALTER TABLE STU_DETAILS MODIFY (NAME VARCHAR2(20));

d. TRUNCATE: It is used to delete all the rows from the table and free the space occupied by the table.

Syntax:

TRUNCATE TABLE table_name;

Example:

TRUNCATE TABLE EMPLOYEE;


2. Data Manipulation Language
o DML commands are used to modify the data in the database. They are responsible for all forms of changes to that data.
o DML commands are not auto-committed, which means their changes are not saved permanently until they are committed. They can be rolled back.

Here are some commands that come under DML:

o INSERT
o UPDATE
o DELETE

a. INSERT: The INSERT statement is used to insert data into a row of a table.

Syntax:

INSERT INTO TABLE_NAME
(col1, col2, col3,.... col N)
VALUES (value1, value2, value3, .... valueN);

Or

INSERT INTO TABLE_NAME
VALUES (value1, value2, value3, .... valueN);

For example:

INSERT INTO javatpoint (Author, Subject) VALUES ('Sonoo', 'DBMS');

b. UPDATE: This command is used to update or modify the value of a column in the table.

Syntax:

UPDATE table_name SET [column_name1 = value1,...column_nameN = valueN] [WHERE CONDITION]

For example:

UPDATE students
SET User_Name = 'Sonoo'
WHERE Student_Id = '3';

c. DELETE: It is used to remove one or more rows from a table.

Syntax:

DELETE FROM table_name [WHERE condition];

For example:

DELETE FROM javatpoint
WHERE Author = 'Sonoo';

3. Data Control Language

DCL commands are used to grant and take back authority from any database
user.

Here are some commands that come under DCL:

o Grant
o Revoke

a. Grant: It is used to give a user access privileges to a database.

Example

GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;

b. Revoke: It is used to take back permissions from the user.

Example

REVOKE SELECT, UPDATE ON MY_TABLE FROM USER1, USER2;


4. Transaction Control Language

TCL commands can only be used with DML commands such as INSERT, DELETE and UPDATE.

DDL operations are automatically committed in the database, which is why TCL commands cannot be used while creating or dropping tables.

Here are some commands that come under TCL:

o COMMIT
o ROLLBACK
o SAVEPOINT

a. Commit: The COMMIT command is used to save all the changes of the current transaction to the database.

Syntax:

COMMIT;

Example:

DELETE FROM CUSTOMERS
WHERE AGE = 25;
COMMIT;

b. Rollback: The ROLLBACK command is used to undo changes that have not yet been saved to the database.

Syntax:

ROLLBACK;

Example:

DELETE FROM CUSTOMERS
WHERE AGE = 25;
ROLLBACK;

c. SAVEPOINT: It marks a point within a transaction to which the transaction can later be rolled back, without rolling back the entire transaction.

Syntax:

SAVEPOINT SAVEPOINT_NAME;
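
The effect of COMMIT, ROLLBACK and SAVEPOINT can be tried out with Python's built-in sqlite3 module. This is a minimal sketch: SQLite stands in for a full RDBMS, and the CUSTOMERS rows and the savepoint name are invented for the example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN / SAVEPOINT / ROLLBACK / COMMIT statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE CUSTOMERS (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER)")
conn.executemany(
    "INSERT INTO CUSTOMERS (NAME, AGE) VALUES (?, ?)",
    [("Ann", 25), ("Bob", 30), ("Cy", 25)],
)


def row_count():
    return conn.execute("SELECT COUNT(*) FROM CUSTOMERS").fetchone()[0]


# ROLLBACK undoes the uncommitted DELETE.
conn.execute("BEGIN")
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.execute("ROLLBACK")
after_rollback = row_count()

# SAVEPOINT lets us undo only the second DELETE, then COMMIT keeps the first.
conn.execute("BEGIN")
conn.execute("DELETE FROM CUSTOMERS WHERE NAME = 'Bob'")
conn.execute("SAVEPOINT before_age_delete")
conn.execute("DELETE FROM CUSTOMERS WHERE AGE = 25")
conn.execute("ROLLBACK TO SAVEPOINT before_age_delete")
conn.execute("COMMIT")
after_commit = row_count()

print(after_rollback, after_commit)
```

After the rollback all three rows are still present; after the commit only the row deleted before the savepoint is gone.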
5. Data Query Language

DQL is used to fetch the data from the database.

It uses only one command:

o SELECT

a. SELECT: It combines the selection and projection operations of relational algebra. It retrieves the listed attributes from the rows that satisfy the condition described by the WHERE clause.

Syntax:

SELECT expressions
FROM TABLES
WHERE conditions;

For example:

SELECT emp_name
FROM employee
WHERE age > 20;
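
The DDL, DML and DQL commands above can be exercised end to end with Python's sqlite3 module. This is a sketch under assumptions: SQLite is used instead of the Oracle-style dialect in the notes (so TEXT replaces VARCHAR2 and ALTER uses ADD COLUMN), and the employee rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create the table, then alter its structure.
cur.execute("CREATE TABLE employee (emp_name TEXT, email TEXT, age INTEGER)")
cur.execute("ALTER TABLE employee ADD COLUMN address TEXT")

# DML: insert, update and delete rows.
cur.executemany(
    "INSERT INTO employee (emp_name, email, age) VALUES (?, ?, ?)",
    [
        ("Sonoo", "sonoo@example.com", 19),
        ("Asha", "asha@example.com", 25),
        ("Ravi", "ravi@example.com", 18),
    ],
)
cur.execute("UPDATE employee SET age = 31 WHERE emp_name = 'Sonoo'")
cur.execute("DELETE FROM employee WHERE age <= 20")
conn.commit()

# DQL: fetch the names of employees older than 20.
names = [row[0] for row in cur.execute(
    "SELECT emp_name FROM employee WHERE age > 20 ORDER BY emp_name"
)]

# DDL again: drop the table and confirm that it is gone.
cur.execute("DROP TABLE employee")
tables_left = [row[0] for row in cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]

print(names, tables_left)
```

The placeholders (?) keep the data separate from the SQL text, which is the usual way to pass values from a host language.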

SQL Operator

There are various types of SQL operator:



SQL Arithmetic Operators

Let's assume 'variable a' and 'variable b'. Here, 'a' contains 20 and 'b' contains
10.

Operator  Description                                                                                  Example
--------  -----------                                                                                  -------
+         It adds the values of both operands.                                                         a+b will give 30
-         It subtracts the right-hand operand from the left-hand operand.                              a-b will give 10
*         It multiplies the values of both operands.                                                   a*b will give 200
/         It divides the left-hand operand by the right-hand operand.                                  a/b will give 2
%         It divides the left-hand operand by the right-hand operand and returns the remainder.        a%b will give 0

SQL Comparison Operators:

Let's assume 'variable a' and 'variable b'. Here, 'a' contains 20 and 'b' contains
10.

Operator  Description                                                                                            Example
--------  -----------                                                                                            -------
=         Checks whether the values of the two operands are equal; if they are, the condition becomes true.      (a=b) is not true
!=        Checks whether the values of the two operands are equal; if they are not, the condition becomes true.  (a!=b) is true
<>        Checks whether the values of the two operands are equal; if they are not, the condition becomes true.  (a<>b) is true
>         Checks whether the left operand is greater than the right operand; if so, the condition becomes true.  (a>b) is true
<         Checks whether the left operand is less than the right operand; if so, the condition becomes true.     (a<b) is not true
>=        Checks whether the left operand is greater than or equal to the right operand; if so, the condition becomes true.  (a>=b) is true
<=        Checks whether the left operand is less than or equal to the right operand; if so, the condition becomes true.     (a<=b) is not true
!<        Checks whether the left operand is not less than the right operand; if so, the condition becomes true.             (a!<b) is true
!>        Checks whether the left operand is not greater than the right operand; if so, the condition becomes true.          (a!>b) is not true
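
The arithmetic and comparison results for a = 20 and b = 10 can be checked by letting a database evaluate them. The sketch below uses Python's sqlite3 module as an assumed stand-in; SQLite reports true as 1 and false as 0, and it does not support the !< and !> operators, so those two are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
params = {"a": 20, "b": 10}

# Arithmetic operators: +, -, *, / and %.
arithmetic = conn.execute(
    "SELECT :a + :b, :a - :b, :a * :b, :a / :b, :a % :b", params
).fetchone()

# Comparison operators: =, !=, <>, >, <, >= and <=.
comparison = conn.execute(
    "SELECT :a = :b, :a != :b, :a <> :b, :a > :b, :a < :b, :a >= :b, :a <= :b",
    params,
).fetchone()

print(arithmetic)
print(comparison)
```

With a greater than b, the >, >= and both inequality tests are true, while =, < and <= are false.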

SQL Logical Operators

There is the list of logical operator used in SQL:

Operator Description

ALL It compares a value to all values in another value set.

AND It allows the existence of multiple conditions in an SQL statement.

ANY It compares the values in the list according to the condition.

BETWEEN It is used to search for values that lie within a given range.

IN It compares a value to that specified list value.

NOT It reverses the meaning of any logical operator.

OR It combines multiple conditions in SQL statements.

EXISTS It is used to search for the presence of a row in a specified table.

LIKE It compares a value to similar values using wildcard operator.



SQL Table

o SQL Table is a collection of data which is organized in terms of rows


and columns. In DBMS, the table is known as relation and row as a
tuple.
o Table is a simple form of data storage. A table is also considered as a
convenient representation of relations.

Let's see an example of the EMPLOYEE table:

EMP_ID EMP_NAME CITY PHONE_NO

1 Kristen Washington 7289201223

2 Anna Franklin 9378282882

3 Jackson Bristol 9264783838

4 Kellan California 7254728346

5 Ashley Hawaii 9638482678

In the above table, "EMPLOYEE" is the table name, and "EMP_ID", "EMP_NAME", "CITY", "PHONE_NO" are the column names. The combination of data of multiple columns forms a row, e.g., 1, "Kristen", "Washington" and 7289201223 are the data of one row.

Operation on Table
1. Create table
2. Drop table
3. Delete table
4. Rename table

SQL Create Table

SQL create table is used to create a table in the database. To define the table,
you should define the name of the table and also define its columns and
column's data type.

Syntax

create table "table_name"
("column1" "data type",
"column2" "data type",
"column3" "data type",
...
"columnN" "data type");

Example

SQL> CREATE TABLE EMPLOYEE (
EMP_ID INT NOT NULL,
EMP_NAME VARCHAR (25) NOT NULL,
PHONE_NO INT NOT NULL,
ADDRESS CHAR (30),
PRIMARY KEY (EMP_ID)
);

If the table is created successfully, the SQL server reports this in its response message. You can also verify the table with the DESC command as follows:

SQL> DESC EMPLOYEE;

Field     Type         Null  Key  Default  Extra
--------  -----------  ----  ---  -------  -----
EMP_ID    int(11)      NO    PRI  NULL
EMP_NAME  varchar(25)  NO         NULL
PHONE_NO  int(11)      NO         NULL
ADDRESS   char(30)     YES        NULL

4 rows in set (0.35 sec)

Now you have an EMPLOYEE table in the database, and you can use the
stored information related to the employees.

Drop table

The SQL DROP TABLE statement is used to delete a table definition and all the data in the table. When this command is executed, all the information in the table is lost forever, so you have to be very careful while using it.

Syntax

DROP TABLE "table_name";

First, verify that the EMPLOYEE table exists using the following command:

SQL> DESC EMPLOYEE;

Field     Type         Null  Key  Default  Extra
--------  -----------  ----  ---  -------  -----
EMP_ID    int(11)      NO    PRI  NULL
EMP_NAME  varchar(25)  NO         NULL
PHONE_NO  int(11)      NO         NULL
ADDRESS   char(30)     YES        NULL

4 rows in set (0.35 sec)

This table shows that EMPLOYEE table is available in the database, so we can
drop it as follows:

SQL> DROP TABLE EMPLOYEE;

The server confirms the command with a message such as:

Query OK, 0 rows affected (0.01 sec)

The table has been dropped, so it can no longer be displayed.

SQL DELETE table

In SQL, DELETE statement is used to delete rows from a table. We can use
WHERE condition to delete a specific row from a table. If you want to delete
all the records from the table, then you don't need to use the WHERE clause.

Syntax

1. DELETE FROM table_name WHERE condition;

Example

Suppose the EMPLOYEE table has the following records:

EMP_ID  EMP_NAME   CITY         PHONE_NO    SALARY
------  ---------  -----------  ----------  ------
1       Kristen    Chicago      9737287378  150000
2       Russell    Austin       9262738271  200000
3       Denzel     Boston       7353662627  100000
4       Angelina   Denver       9232673822  600000
5       Robert     Washington   9367238263  350000
6       Christian  Los Angeles  7253847382  260000

The following query will DELETE the employee whose ID is 3.

SQL> DELETE FROM EMPLOYEE
WHERE EMP_ID = 3;

Now, the EMPLOYEE table has the following records.

EMP_ID  EMP_NAME   CITY         PHONE_NO    SALARY
------  ---------  -----------  ----------  ------
1       Kristen    Chicago      9737287378  150000
2       Russell    Austin       9262738271  200000
4       Angelina   Denver       9232673822  600000
5       Robert     Washington   9367238263  350000
6       Christian  Los Angeles  7253847382  260000

If you don't specify the WHERE condition, it will remove all the rows from the table.

DELETE FROM EMPLOYEE;

Now, the EMPLOYEE table does not have any records.
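
The difference between deleting rows and dropping the table can be seen in a short Python sketch using sqlite3. The helper table_exists and the three sample rows are assumptions; SQLite's sqlite_master catalogue is used here in place of the DESC command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " EMP_ID INTEGER NOT NULL PRIMARY KEY,"
    " EMP_NAME TEXT NOT NULL,"
    " CITY TEXT)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Kristen", "Chicago"), (2, "Russell", "Austin"), (3, "Denzel", "Boston")],
)


def table_exists(name):
    """Look the table up in SQLite's catalogue of schema objects."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row[0] == 1


# DELETE with WHERE removes only the matching row.
conn.execute("DELETE FROM EMPLOYEE WHERE EMP_ID = 3")
ids_after_delete = [r[0] for r in conn.execute("SELECT EMP_ID FROM EMPLOYEE ORDER BY EMP_ID")]

# DELETE without WHERE empties the table but keeps its definition.
conn.execute("DELETE FROM EMPLOYEE")
rows_left = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
exists_after_delete = table_exists("EMPLOYEE")

# DROP TABLE removes the definition as well.
conn.execute("DROP TABLE EMPLOYEE")
exists_after_drop = table_exists("EMPLOYEE")

print(ids_after_delete, rows_left, exists_after_delete, exists_after_drop)
```

After DELETE the empty table can still be queried and refilled; after DROP any query against it fails because the table no longer exists.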

Views in SQL

o Views in SQL are considered as a virtual table. A view also contains rows
and columns.
o To create the view, we can select the fields from one or more tables
present in the database.
o A view can either have specific rows based on certain condition or all
the rows of a table.

Sample table:

Student_Detail

STU_ID NAME ADDRESS

1 Stephan Delhi

2 Kathrin Noida

3 David Ghaziabad

4 Alina Gurugram

Student_Marks

STU_ID  NAME     MARKS  AGE
------  -------  -----  ---
1       Stephan  97     19
2       Kathrin  86     21
3       David    74     18
4       Alina    90     20
5       John     96     18

1. Creating view

A view can be created using the CREATE VIEW statement. We can create a
view from a single table or multiple tables.

Syntax:

CREATE VIEW view_name AS
SELECT column1, column2.....
FROM table_name
WHERE condition;

2. Creating View from a single table

In this example, we create a View named DetailsView from the table Student_Detail.

Query:

CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM Student_Detail
WHERE STU_ID < 4;

Just as with a table, we can query the view to see its data.

SELECT * FROM DetailsView;

Output:

NAME     ADDRESS
-------  ---------
Stephan  Delhi
Kathrin  Noida
David    Ghaziabad

3. Creating View from multiple tables

A view from multiple tables can be created by simply including multiple tables in the SELECT statement.

In the given example, a view named MarksView is created from two tables, Student_Detail and Student_Marks.

Query:

CREATE VIEW MarksView AS
SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS
FROM Student_Detail, Student_Marks
WHERE Student_Detail.NAME = Student_Marks.NAME;

To display the data of the View MarksView:

SELECT * FROM MarksView;

NAME     ADDRESS    MARKS
-------  ---------  -----
Stephan  Delhi      97
Kathrin  Noida      86
David    Ghaziabad  74
Alina    Gurugram   90

4. Deleting View

A view can be deleted using the DROP VIEW statement.

Syntax

DROP VIEW view_name;

Example:

If we want to delete the View MarksView, we can do this as:

DROP VIEW MarksView;
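
The view examples above can be reproduced with Python's sqlite3 module. This is a sketch: SQLite is an assumed stand-in for the RDBMS used in the notes, and ORDER BY clauses are added only to make the printed order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student_Detail (STU_ID INTEGER, NAME TEXT, ADDRESS TEXT);
CREATE TABLE Student_Marks (STU_ID INTEGER, NAME TEXT, MARKS INTEGER, AGE INTEGER);

INSERT INTO Student_Detail VALUES
    (1, 'Stephan', 'Delhi'), (2, 'Kathrin', 'Noida'),
    (3, 'David', 'Ghaziabad'), (4, 'Alina', 'Gurugram');
INSERT INTO Student_Marks VALUES
    (1, 'Stephan', 97, 19), (2, 'Kathrin', 86, 21), (3, 'David', 74, 18),
    (4, 'Alina', 90, 20), (5, 'John', 96, 18);

-- View over a single table.
CREATE VIEW DetailsView AS
SELECT NAME, ADDRESS
FROM Student_Detail
WHERE STU_ID < 4;

-- View over two tables, matched on NAME.
CREATE VIEW MarksView AS
SELECT Student_Detail.NAME, Student_Detail.ADDRESS, Student_Marks.MARKS
FROM Student_Detail, Student_Marks
WHERE Student_Detail.NAME = Student_Marks.NAME;
""")

details = conn.execute("SELECT * FROM DetailsView ORDER BY NAME").fetchall()
marks = conn.execute("SELECT * FROM MarksView ORDER BY MARKS DESC").fetchall()

# Dropping a view removes only the stored query, not the underlying tables.
conn.execute("DROP VIEW MarksView")
remaining_views = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'"
)]

print(details)
print(marks)
print(remaining_views)
```

John appears in Student_Marks only, so he has no match in Student_Detail and is absent from MarksView.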

SQL Index

o Indexes are special lookup tables. They are used to retrieve data from the database very fast.
o An index speeds up SELECT queries and WHERE clauses, but it slows down data input with INSERT and UPDATE statements. Indexes can be created or dropped without affecting the data.
o An index in a database is just like an index in the back of a book.
o For example: to find all pages in a book that discuss a certain topic, you first refer to the index, which lists all the topics alphabetically, and are then referred to one or more specific page numbers.

1. Create Index statement

It is used to create an index on a table. It allows duplicate values.

Syntax

CREATE INDEX index_name
ON table_name (column1, column2, ...);

Example

CREATE INDEX idx_name
ON Persons (LastName, FirstName);

2. Unique Index statement

It is used to create a unique index on a table. It does not allow duplicate values.

Syntax

CREATE UNIQUE INDEX index_name
ON table_name (column1, column2, ...);

Example

CREATE UNIQUE INDEX websites_idx
ON websites (site_name);

3. Drop Index Statement

It is used to delete an index in a table.

Syntax

DROP INDEX index_name;

Example

DROP INDEX websites_idx;
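
The behaviour of plain and unique indexes can be checked with Python's sqlite3 module. The sketch below assumes SQLite's DROP INDEX form, which matches the syntax above; other database systems may need the table name as well. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A plain index speeds up lookups and still allows duplicate values.
conn.execute("CREATE TABLE Persons (LastName TEXT, FirstName TEXT)")
conn.execute("CREATE INDEX idx_name ON Persons (LastName, FirstName)")
conn.executemany("INSERT INTO Persons VALUES (?, ?)", [("Khan", "Ali"), ("Khan", "Ali")])
persons_rows = conn.execute("SELECT COUNT(*) FROM Persons").fetchone()[0]

# A unique index rejects a second row with the same indexed value.
conn.execute("CREATE TABLE websites (site_name TEXT)")
conn.execute("CREATE UNIQUE INDEX websites_idx ON websites (site_name)")
conn.execute("INSERT INTO websites VALUES ('example.org')")
try:
    conn.execute("INSERT INTO websites VALUES ('example.org')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Dropping the index leaves the data intact and lifts the uniqueness rule.
conn.execute("DROP INDEX websites_idx")
conn.execute("INSERT INTO websites VALUES ('example.org')")
website_rows = conn.execute("SELECT COUNT(*) FROM websites").fetchone()[0]

print(persons_rows, duplicate_rejected, website_rows)
```

The existing row in websites survives the DROP INDEX, which illustrates that indexes can be created or dropped without affecting the data.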


SQL Sub Query

A subquery is a query nested within another SQL query, most often embedded within the WHERE clause.

Important Rules:

o A subquery can be placed in a number of SQL clauses, such as the WHERE clause, FROM clause, and HAVING clause.
o You can use a subquery with SELECT, UPDATE, INSERT, and DELETE statements along with operators like =, <, >, >=, <=, IN, BETWEEN, etc.
o The outer query is known as the main query, and the inner query is known as the subquery.
o Subqueries are placed on the right side of the comparison operator.
o A subquery is enclosed in parentheses.
o In many database systems, ORDER BY cannot be used in a subquery. GROUP BY can be used inside a subquery.

1. Subqueries with the Select Statement

SQL subqueries are most frequently used with the Select statement.

Syntax

SELECT column_name
FROM table_name
WHERE column_name expression operator
( SELECT column_name FROM table_name WHERE ... );

Example

Consider the EMPLOYEE table, which has the following records:

ID  NAME     AGE  ADDRESS    SALARY
--  -------  ---  ---------  --------
1   John     20   US         2000.00
2   Stephan  26   Dubai      1500.00
3   David    27   Bangkok    2000.00
4   Alina    29   UK         6500.00
5   Kathrin  34   Bangalore  8500.00
6   Harry    42   China      4500.00
7   Jackson  25   Mizoram    10000.00

The subquery with a SELECT statement will be:

SELECT *
FROM EMPLOYEE
WHERE ID IN (SELECT ID
FROM EMPLOYEE
WHERE SALARY > 4500);

This would produce the following result:

ID NAME AGE ADDRESS SALARY

4 Alina 29 UK 6500.00

5 Kathrin 34 Bangalore 8500.00

7 Jackson 25 Mizoram 10000.00

2. Subqueries with the INSERT Statement


o An SQL subquery can also be used with the INSERT statement. In the INSERT statement, the data returned from the subquery is inserted into another table.
o In the subquery, the selected data can be modified with any of the character or date functions.

Syntax:

INSERT INTO table_name (column1, column2, column3....)
SELECT *
FROM table_name
WHERE VALUE OPERATOR

Example

Consider a table EMPLOYEE_BKP with a structure similar to EMPLOYEE.

Now use the following syntax to copy the complete EMPLOYEE table into the EMPLOYEE_BKP table.

INSERT INTO EMPLOYEE_BKP
SELECT * FROM EMPLOYEE
WHERE ID IN (SELECT ID
FROM EMPLOYEE);

3. Subqueries with the UPDATE Statement

The subquery of SQL can be used in conjunction with the Update statement.
When a subquery is used with the Update statement, then either single or
multiple columns in a table can be updated.

Syntax

UPDATE table
SET column_name = new_value
WHERE VALUE OPERATOR
(SELECT COLUMN_NAME
FROM TABLE_NAME
WHERE condition);

Example
Let's assume we have an EMPLOYEE_BKP table available which is a backup of the EMPLOYEE table. The given example multiplies the SALARY by 0.25 in the EMPLOYEE table for all employees whose AGE is greater than or equal to 29.

UPDATE EMPLOYEE
SET SALARY = SALARY * 0.25
WHERE AGE IN (SELECT AGE FROM EMPLOYEE_BKP
WHERE AGE >= 29);

This would impact three rows, and finally, the EMPLOYEE table would have
the following records.

ID NAME AGE ADDRESS SALARY

1 John 20 US 2000.00

2 Stephan 26 Dubai 1500.00

3 David 27 Bangkok 2000.00

4 Alina 29 UK 1625.00

5 Kathrin 34 Bangalore 2125.00

6 Harry 42 China 1125.00

7 Jackson 25 Mizoram 10000.00

4. Subqueries with the DELETE Statement

The subquery of SQL can be used in conjunction with the Delete statement
just like any other statements mentioned above.

Syntax
DELETE FROM TABLE_NAME
WHERE VALUE OPERATOR
(SELECT COLUMN_NAME
FROM TABLE_NAME
WHERE condition);

Example

Let's assume we have an EMPLOYEE_BKP table available which is a backup of the EMPLOYEE table. The given example deletes the records from the EMPLOYEE table for all employees whose AGE is greater than or equal to 29.

DELETE FROM EMPLOYEE
WHERE AGE IN (SELECT AGE FROM EMPLOYEE_BKP
WHERE AGE >= 29);

This would impact three rows, and finally, the EMPLOYEE table would have
the following records.

ID NAME AGE ADDRESS SALARY

1 John 20 US 2000.00

2 Stephan 26 Dubai 1500.00

3 David 27 Bangkok 2000.00

7 Jackson 25 Mizoram 10000.00
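
The four subquery examples can be replayed with Python's sqlite3 module. This is a sketch with SQLite as an assumed stand-in; the EMPLOYEE rows are the ones listed above, and the backup table is filled with the INSERT ... SELECT form shown earlier. The UPDATE and DELETE are run one after the other on the same table here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL);
INSERT INTO EMPLOYEE VALUES
    (1, 'John', 20, 'US', 2000.00), (2, 'Stephan', 26, 'Dubai', 1500.00),
    (3, 'David', 27, 'Bangkok', 2000.00), (4, 'Alina', 29, 'UK', 6500.00),
    (5, 'Kathrin', 34, 'Bangalore', 8500.00), (6, 'Harry', 42, 'China', 4500.00),
    (7, 'Jackson', 25, 'Mizoram', 10000.00);

-- Subquery with INSERT: copy every row into the backup table.
CREATE TABLE EMPLOYEE_BKP (ID INTEGER PRIMARY KEY, NAME TEXT, AGE INTEGER, ADDRESS TEXT, SALARY REAL);
INSERT INTO EMPLOYEE_BKP
SELECT * FROM EMPLOYEE
WHERE ID IN (SELECT ID FROM EMPLOYEE);
""")

# Subquery with SELECT: employees earning more than 4500.
high_paid = [r[0] for r in conn.execute(
    "SELECT ID FROM EMPLOYEE "
    "WHERE ID IN (SELECT ID FROM EMPLOYEE WHERE SALARY > 4500) ORDER BY ID"
)]

# Subquery with UPDATE: multiply the salary by 0.25 for ages 29 and above.
conn.execute(
    "UPDATE EMPLOYEE SET SALARY = SALARY * 0.25 "
    "WHERE AGE IN (SELECT AGE FROM EMPLOYEE_BKP WHERE AGE >= 29)"
)
updated = dict(conn.execute("SELECT ID, SALARY FROM EMPLOYEE WHERE AGE >= 29 ORDER BY ID"))

# Subquery with DELETE: remove the same employees.
conn.execute(
    "DELETE FROM EMPLOYEE "
    "WHERE AGE IN (SELECT AGE FROM EMPLOYEE_BKP WHERE AGE >= 29)"
)
remaining = [r[0] for r in conn.execute("SELECT ID FROM EMPLOYEE ORDER BY ID")]

print(high_paid, updated, remaining)
```

Reading the ages from EMPLOYEE_BKP rather than EMPLOYEE keeps the inner query independent of the table being changed.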

SQL Aggregate Functions

o An SQL aggregate function performs a calculation on multiple rows of
a single column of a table. It returns a single value.
o It is also used to summarize the data.

Types of SQL Aggregation Function

1. COUNT Function
o The COUNT function is used to count the number of rows in a database
table. It can work on both numeric and non-numeric data types.
o COUNT(*) returns the count of all the rows in a specified table,
including duplicate rows and rows that contain NULL values.

Syntax

COUNT(*)
or
COUNT( [ALL|DISTINCT] expression )

Sample table:

PRODUCT_MAST

PRODUCT  COMPANY  QTY  RATE  COST
Item1    Com1     2    10    20
Item2    Com2     3    25    75
Item3    Com1     2    30    60
Item4    Com3     5    10    50
Item5    Com2     2    20    40
Item6    Com1     3    25    75
Item7    Com1     5    30    150
Item8    Com1     3    10    30
Item9    Com2     2    25    50
Item10   Com3     4    30    120

Example: COUNT()

SELECT COUNT(*)
FROM PRODUCT_MAST;

Output:

10

Example: COUNT with WHERE

SELECT COUNT(*)
FROM PRODUCT_MAST
WHERE RATE>=20;

Output:

7

Example: COUNT() with DISTINCT

SELECT COUNT(DISTINCT COMPANY)
FROM PRODUCT_MAST;

Output:

3

Example: COUNT() with GROUP BY

SELECT COMPANY, COUNT(*)
FROM PRODUCT_MAST
GROUP BY COMPANY;

Output:

Com1 5
Com2 3
Com3 2

Example: COUNT() with HAVING

SELECT COMPANY, COUNT(*)
FROM PRODUCT_MAST
GROUP BY COMPANY
HAVING COUNT(*)>2;

Output:

Com1 5
Com2 3

2. SUM Function

The SUM function is used to calculate the sum of the values in the selected
column. It works on numeric fields only.

Syntax

SUM()
or
SUM( [ALL|DISTINCT] expression )

Example: SUM()

SELECT SUM(COST)
FROM PRODUCT_MAST;

Output:

670

Example: SUM() with WHERE

SELECT SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3;

Output:

320

Example: SUM() with GROUP BY

SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
WHERE QTY>3
GROUP BY COMPANY;

Output:

Com1 150
Com3 170

Example: SUM() with HAVING

SELECT COMPANY, SUM(COST)
FROM PRODUCT_MAST
GROUP BY COMPANY
HAVING SUM(COST)>=170;

Output:

Com1 335
Com3 170
3. AVG function

The AVG function is used to calculate the average value of the numeric type.
AVG function returns the average of all non-Null values.

Syntax

AVG()
or
AVG( [ALL|DISTINCT] expression )

Example:

SELECT AVG(COST)
FROM PRODUCT_MAST;

Output:

67.00
4. MAX Function

MAX function is used to find the maximum value of a certain column. This
function determines the largest value of all selected values of a column.

Syntax

MAX()
or
MAX( [ALL|DISTINCT] expression )

Example:

SELECT MAX(RATE)
FROM PRODUCT_MAST;

Output:

30
5. MIN Function

MIN function is used to find the minimum value of a certain column. This
function determines the smallest value of all selected values of a column.

Syntax

MIN()
or
MIN( [ALL|DISTINCT] expression )

Example:

SELECT MIN(RATE)
FROM PRODUCT_MAST;

Output:

10
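
The aggregate examples above can be reproduced with Python's built-in sqlite3 module. The sketch below loads the PRODUCT_MAST sample rows (with Item6 assigned to Com1, as the GROUP BY output implies) and runs one query per function. The helper name scalar is only an illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PRODUCT_MAST "
    "(PRODUCT TEXT, COMPANY TEXT, QTY INTEGER, RATE INTEGER, COST INTEGER)"
)
conn.executemany(
    "INSERT INTO PRODUCT_MAST VALUES (?, ?, ?, ?, ?)",
    [
        ("Item1", "Com1", 2, 10, 20), ("Item2", "Com2", 3, 25, 75),
        ("Item3", "Com1", 2, 30, 60), ("Item4", "Com3", 5, 10, 50),
        ("Item5", "Com2", 2, 20, 40), ("Item6", "Com1", 3, 25, 75),
        ("Item7", "Com1", 5, 30, 150), ("Item8", "Com1", 3, 10, 30),
        ("Item9", "Com2", 2, 25, 50), ("Item10", "Com3", 4, 30, 120),
    ],
)

def scalar(sql):
    """Run a query that returns one row with one column and return that value."""
    return conn.execute(sql).fetchone()[0]

count_all = scalar("SELECT COUNT(*) FROM PRODUCT_MAST")
count_rate = scalar("SELECT COUNT(*) FROM PRODUCT_MAST WHERE RATE >= 20")
count_companies = scalar("SELECT COUNT(DISTINCT COMPANY) FROM PRODUCT_MAST")
total_cost = scalar("SELECT SUM(COST) FROM PRODUCT_MAST")
avg_cost = scalar("SELECT AVG(COST) FROM PRODUCT_MAST")
max_rate = scalar("SELECT MAX(RATE) FROM PRODUCT_MAST")
min_rate = scalar("SELECT MIN(RATE) FROM PRODUCT_MAST")

# GROUP BY forms one group per company; HAVING filters the groups afterwards.
by_company = conn.execute(
    "SELECT COMPANY, SUM(COST) FROM PRODUCT_MAST "
    "GROUP BY COMPANY HAVING SUM(COST) >= 170 ORDER BY COMPANY"
).fetchall()
print(count_all, count_rate, count_companies, total_cost, avg_cost, max_rate, min_rate)
print(by_company)
```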

SQL JOIN

As the name suggests, JOIN means to combine something. In SQL, JOIN
means "to combine two or more tables".

In SQL, the JOIN clause is used to combine the records from two or more tables
in a database.

Types of SQL JOIN


1. INNER JOIN
2. LEFT JOIN
3. RIGHT JOIN
4. FULL JOIN

Sample Table

EMPLOYEE

EMP_ID  EMP_NAME   CITY         SALARY  AGE
1       Angelina   Chicago      200000  30
2       Robert     Austin       300000  26
3       Christian  Denver       100000  42
4       Kristen    Washington   500000  29
5       Russell    Los Angeles  200000  36
6       Marry      Canada       600000  48

PROJECT

PROJECT_NO  EMP_ID  DEPARTMENT
101         1       Testing
102         2       Development
103         3       Designing
104         4       Development

1. INNER JOIN

In SQL, INNER JOIN selects the records that have matching values in both
tables. It returns the combination of rows from both tables for which the
join condition is satisfied.

Syntax

SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
INNER JOIN table2
ON table1.matching_column = table2.matching_column;

Query

SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
INNER JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;

Output

EMP_NAME   DEPARTMENT
Angelina   Testing
Robert     Development
Christian  Designing
Kristen    Development

2. LEFT JOIN

The SQL LEFT JOIN returns all the rows from the left table and the matching
values from the right table. If a left-table row has no match in the right
table, NULL is returned for the right table's columns.

Syntax

SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
LEFT JOIN table2
ON table1.matching_column = table2.matching_column;

Query

SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
LEFT JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;

Output

EMP_NAME   DEPARTMENT
Angelina   Testing
Robert     Development
Christian  Designing
Kristen    Development
Russell    NULL
Marry      NULL

3. RIGHT JOIN

In SQL, RIGHT JOIN returns all the rows from the right table and the
matched values from the left table. If a right-table row has no match in
the left table, NULL is returned for the left table's columns.

Syntax

SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
RIGHT JOIN table2
ON table1.matching_column = table2.matching_column;

Query

SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
RIGHT JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;

Output

EMP_NAME   DEPARTMENT
Angelina   Testing
Robert     Development
Christian  Designing
Kristen    Development

4. FULL JOIN

In SQL, FULL JOIN combines the results of both the left and the right outer
join. The joined table contains all the records from both tables, with NULL
in place of the values for which no match is found.

Syntax

SELECT table1.column1, table1.column2, table2.column1,....
FROM table1
FULL JOIN table2
ON table1.matching_column = table2.matching_column;

Query

SELECT EMPLOYEE.EMP_NAME, PROJECT.DEPARTMENT
FROM EMPLOYEE
FULL JOIN PROJECT
ON PROJECT.EMP_ID = EMPLOYEE.EMP_ID;

Output

EMP_NAME   DEPARTMENT
Angelina   Testing
Robert     Development
Christian  Designing
Kristen    Development
Russell    NULL
Marry      NULL
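
The join outputs above can be verified with a short sqlite3 sketch. Only the columns used by the queries are loaded, and the sketch sticks to INNER JOIN and LEFT JOIN because older SQLite builds do not support RIGHT JOIN or FULL JOIN. The table aliases e and p and the ORDER BY clause are additions for readability and stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, EMP_NAME TEXT)")
conn.execute("CREATE TABLE PROJECT (PROJECT_NO INTEGER, EMP_ID INTEGER, DEPARTMENT TEXT)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [(1, "Angelina"), (2, "Robert"), (3, "Christian"),
     (4, "Kristen"), (5, "Russell"), (6, "Marry")],
)
conn.executemany(
    "INSERT INTO PROJECT VALUES (?, ?, ?)",
    [(101, 1, "Testing"), (102, 2, "Development"),
     (103, 3, "Designing"), (104, 4, "Development")],
)

# INNER JOIN keeps only employees that have a matching PROJECT row.
inner = conn.execute(
    "SELECT e.EMP_NAME, p.DEPARTMENT FROM EMPLOYEE e "
    "INNER JOIN PROJECT p ON p.EMP_ID = e.EMP_ID ORDER BY e.EMP_ID"
).fetchall()

# LEFT JOIN keeps every employee; unmatched ones get NULL (None in Python).
left = conn.execute(
    "SELECT e.EMP_NAME, p.DEPARTMENT FROM EMPLOYEE e "
    "LEFT JOIN PROJECT p ON p.EMP_ID = e.EMP_ID ORDER BY e.EMP_ID"
).fetchall()
print(inner)
print(left)
```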

Transaction Processing Systems


Transactions
A transaction is a program unit comprising a collection of database operations,
executed as a logical unit of data processing. The operations performed in a
transaction include one or more database operations such as insert, delete,
update or retrieve. It is an atomic process: it is either performed to
completion or not performed at all. A transaction involving only data
retrieval without any data update is called a read-only transaction.
Each high-level operation can be divided into a number of low-level tasks or
operations. For example, a data update operation can be divided into three
tasks −
• read_item() − reads the data item from storage into main memory.
• modify_item() − changes the value of the item in main memory.
• write_item() − writes the modified value from main memory to storage.
Database access is restricted to the read_item() and write_item() operations.
Thus, for all transactions, read and write form the basic database
operations.
Transaction Operations
The low-level operations performed in a transaction are −
• begin_transaction − A marker that specifies the start of transaction
execution.
• read_item or write_item − Database operations that may be
interleaved with main memory operations as part of the transaction.
• end_transaction − A marker that specifies the end of the transaction.
• commit − A signal that the transaction has completed successfully
in its entirety and will not be undone.
• rollback − A signal that the transaction has been unsuccessful, so
all its temporary changes in the database are undone. A committed
transaction cannot be rolled back.
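
As a rough illustration of commit and rollback, the sketch below uses Python's sqlite3 module. The ACCOUNT table, its balances and the transfer function are hypothetical and not part of the text above. A transfer either commits both updates or rolls both back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ACCOUNT (ID INTEGER PRIMARY KEY, BALANCE INTEGER)")
conn.executemany("INSERT INTO ACCOUNT VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on commit."""
    try:
        # The first UPDATE implicitly begins a transaction.
        conn.execute("UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ID = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT BALANCE FROM ACCOUNT WHERE ID = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ID = ?", (amount, dst))
        conn.commit()      # both updates become permanent together
        return True
    except ValueError:
        conn.rollback()    # the partial update is undone
        return False

ok = transfer(conn, 1, 2, 30)       # succeeds and commits
failed = transfer(conn, 1, 2, 500)  # would overdraw, so it is rolled back
balances = dict(conn.execute("SELECT ID, BALANCE FROM ACCOUNT"))
print(ok, failed, balances)
```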
Transaction States
A transaction may go through a subset of five states: active, partially
committed, committed, failed and aborted.
• Active − The initial state of a transaction. The transaction remains
in this state while it is executing read, write or other operations.
• Partially Committed − The transaction enters this state after its last
statement has been executed.
• Committed − The transaction enters this state after it has completed
successfully and the system checks have issued the commit signal.
• Failed − The transaction goes from the partially committed or active
state to the failed state when it is discovered that normal execution can
no longer proceed or the system checks fail.
• Aborted − The state after the failed transaction has been rolled back
and the database has been restored to the state it was in before the
transaction began.
A state transition diagram (not reproduced here) depicts these states and
the low-level transaction operations that cause the changes between them.
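
The allowed state changes described above can be written down as a small transition table. This is a minimal sketch built only from the five state descriptions; the names TRANSITIONS and is_valid_path are illustrative.

```python
# Allowed transitions between the five transaction states described above.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),   # terminal state
    "aborted": set(),     # terminal state
}

def is_valid_path(states):
    """Return True if the sequence starts at 'active' and follows TRANSITIONS."""
    if not states or states[0] != "active":
        return False
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))

print(is_valid_path(["active", "partially committed", "committed"]))
print(is_valid_path(["active", "committed"]))
```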

Desirable Properties of Transactions


Any transaction must maintain the ACID properties, viz. Atomicity,
Consistency, Isolation, and Durability.
• Atomicity − This property states that a transaction is an atomic unit of
processing, that is, either it is performed in its entirety or not performed
at all. No partial update should exist.
• Consistency − A transaction should take the database from one
consistent state to another consistent state. It should not adversely
affect any data item in the database.
• Isolation − A transaction should be executed as if it is the only one in
the system. There should not be any interference from the other
concurrent transactions that are simultaneously running.
• Durability − If a committed transaction brings about a change, that
change should be durable in the database and not lost in case of any
failure.
Schedules and Conflicts
In a system with a number of simultaneous transactions, a schedule is the
total order of execution of their operations. Given a schedule S comprising n
transactions, say T1, T2, T3, ..., Tn, the operations of any transaction Ti
must appear in S in the same order in which they occur in Ti.
Types of Schedules
There are two types of schedules −
• Serial Schedules − In a serial schedule, only one transaction is active
at any point in time, i.e. there is no overlapping of transactions.
• Parallel Schedules − In a parallel schedule, more than one transaction
is active simultaneously, i.e. the transactions contain operations that
overlap in time.

Conflicts in Schedules
In a schedule comprising multiple transactions, a conflict occurs when
two active transactions perform non-compatible operations. Two operations
are said to be in conflict when all of the following three conditions hold
simultaneously −
• The two operations are parts of different transactions.
• Both operations access the same data item.
• At least one of the operations is a write_item() operation, i.e. it tries to
modify the data item.
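
The three conditions translate directly into a predicate. In this sketch the Op record and its field names are assumptions made for illustration.

```python
from typing import NamedTuple

class Op(NamedTuple):
    txn: str      # transaction identifier, e.g. "T1"
    action: str   # "read" or "write"
    item: str     # data item name, e.g. "X"

def in_conflict(a: Op, b: Op) -> bool:
    """True when all three conflict conditions hold for operations a and b."""
    return (
        a.txn != b.txn                          # different transactions
        and a.item == b.item                    # same data item
        and "write" in (a.action, b.action)     # at least one write
    )

print(in_conflict(Op("T1", "read", "X"), Op("T2", "write", "X")))
print(in_conflict(Op("T1", "read", "X"), Op("T2", "read", "X")))
```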
Serializability
A serializable schedule of n transactions is a parallel schedule that is
equivalent to a serial schedule comprising the same n transactions. A
serializable schedule retains the correctness of a serial schedule while
achieving the better CPU utilization of a parallel schedule.
Equivalence of Schedules
Equivalence of two schedules can be of the following types −
• Result equivalence − Two schedules producing identical results are
said to be result equivalent.
• View equivalence − Two schedules are said to be view equivalent if
every transaction reads the same values in both schedules and the final
write on each data item is made by the same transaction.
• Conflict equivalence − Two schedules are said to be conflict
equivalent if both contain the same set of transactions and have the
same order of conflicting pairs of operations.
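
Conflict equivalence can be checked by comparing the ordered conflicting pairs of two schedules. The sketch below follows the definition literally. It assumes each operation is a (transaction, action, item) tuple with action "r" or "w", and that no operation appears twice in a schedule.

```python
def conflicting_pairs(schedule):
    """Set of (earlier, later) operation pairs that conflict in the schedule."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            # Different transactions, same item, at least one write.
            if a[0] != b[0] and a[2] == b[2] and "w" in (a[1], b[1]):
                pairs.add((a, b))
    return pairs

def conflict_equivalent(s1, s2):
    """Same operations and the same order for every conflicting pair."""
    return sorted(s1) == sorted(s2) and conflicting_pairs(s1) == conflicting_pairs(s2)

serial_t1_t2 = [("T1", "r", "X"), ("T1", "w", "X"), ("T1", "r", "Y"),
                ("T1", "w", "Y"), ("T2", "r", "X"), ("T2", "w", "X")]
# T2 starts before T1 finishes, but only after T1 is done with X.
interleaved = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"),
               ("T1", "r", "Y"), ("T2", "w", "X"), ("T1", "w", "Y")]
# T2 writes X between T1's read and write of X.
bad = [("T1", "r", "X"), ("T2", "r", "X"), ("T2", "w", "X"),
       ("T1", "w", "X"), ("T1", "r", "Y"), ("T1", "w", "Y")]

print(conflict_equivalent(interleaved, serial_t1_t2))
print(conflict_equivalent(bad, serial_t1_t2))
```

A parallel schedule that is conflict equivalent to some serial schedule, like interleaved here, is serializable.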

Concurrency
Data concurrency is the ability to let multiple users run multiple
transactions within a database. Simply put, data concurrency allows multiple
users to access data at the same time.

Support for concurrency is a core capability of database systems. Most
databases handle concurrency in a similar way, the general principle
being that changed but unsaved data is held in a temporary log or file.
Once the data is saved, it is written into the database’s physical
storage in place of the original data.

Concurrency Control
Concurrency control is part of transaction management in a database
management system (DBMS). It is a DBMS procedure that lets simultaneous
transactions execute without conflicting with each other; such conflicts
occur in multi-user systems.
Concurrency simply means executing multiple transactions at the same
time. It is required to improve time efficiency. If many transactions try to
access the same data, inconsistency can arise. Concurrency control is
required to keep the data consistent.
For example, without concurrency in an ATM network, multiple people could
not withdraw money at the same time from different places. This is where
we need concurrency.
Advantages
The advantages of concurrency control are as follows −
• Waiting time will decrease.
• Response time will decrease.
• Resource utilization will increase.
• System performance and efficiency will increase.
Control concurrency
The simultaneous execution of transactions over shared databases can
create several data integrity and consistency problems.
For example, if many people use ATMs at the same time, the updates on the
bank's servers must be serialized and synchronized whenever a transaction
completes; otherwise the database ends up holding wrong information.

Main problems in using Concurrency


The problems which arise while using concurrency are as follows −
• Lost updates − Two transactions read the same data item and each
writes it back. The later write overwrites the earlier one, so one
transaction nullifies the update of the other.
• Uncommitted dependency or dirty read problem − One transaction
updates a data item and, before it commits, another transaction reads
that item. If the first transaction is then rolled back, the second
transaction has worked with a value that was never committed, which
gives false results.
• Inconsistent retrievals − One transaction reads several data items
while another transaction is in the process of updating them, so the
reader sees some values from before the update and some from after it.
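
The lost update problem can be shown with plain variables standing in for the database and for each transaction's workspace. The balance and the withdrawal amounts are made-up numbers.

```python
# Shared data item, as it would sit in the database.
database = {"balance": 100}

# Both transactions read the item before either one writes it back.
t1_local = database["balance"]      # T1: read_item(balance), sees 100
t2_local = database["balance"]      # T2: read_item(balance), sees 100
t1_local -= 30                      # T1 withdraws 30 in its workspace
t2_local -= 50                      # T2 withdraws 50 in its workspace
database["balance"] = t1_local      # T1: write_item(balance), stores 70
database["balance"] = t2_local      # T2: write_item(balance), overwrites T1's update

interleaved_result = database["balance"]
serial_result = 100 - 30 - 50       # what a serial execution would leave
print(interleaved_result, serial_result)
```

The interleaved run leaves 50 in the account instead of the 20 a serial execution would produce, because T1's withdrawal has been overwritten.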
Concurrency control techniques
The concurrency control techniques are as follows −
Locking
A lock guarantees the current transaction controlled use of a data item.
The transaction first acquires a lock on the data item, then accesses it,
and releases the lock after the transaction completes.
Types of Locks
The types of locks are as follows −
• Shared Lock [The transaction can only read the data item's value]
• Exclusive Lock [The transaction can both read and write the data item's value]
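
A minimal sketch of a lock table follows, assuming the usual compatibility rule that several transactions may share a lock for reading while an exclusive lock admits a single holder. The class and method names are illustrative, and a real lock manager would also queue waiting transactions.

```python
class LockTable:
    """Tracks which transactions hold a shared ('S') or exclusive ('X') lock."""

    def __init__(self):
        self._locks = {}  # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        """Try to lock item in mode 'S' or 'X'; return True on success."""
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)            # shared locks are compatible
            return True
        if holders == {txn}:
            # Sole holder: keep the lock, upgrading S to X if requested.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self._locks[item] = (new_mode, holders)
            return True
        return False                    # incompatible request must wait

    def release_all(self, txn):
        """Release every lock held by txn (done when the transaction ends)."""
        for item in list(self._locks):
            _, holders = self._locks[item]
            holders.discard(txn)
            if not holders:
                del self._locks[item]

locks = LockTable()
print(locks.acquire("T1", "X", "S"), locks.acquire("T2", "X", "S"))
print(locks.acquire("T3", "X", "X"))
```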
Time Stamping
A timestamp is a unique identifier created by the DBMS that indicates the
relative starting time of a transaction.
It can be generated from the system clock or from a logical counter, and
it is assigned whenever a transaction starts. With a logical counter, the
counter is incremented after each new timestamp has been assigned.
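
A logical-counter timestamp generator can be sketched in a few lines. The class name is hypothetical; the point is only that a transaction that starts earlier receives a smaller timestamp.

```python
import itertools

class TimestampGenerator:
    """Issues increasing timestamps from a logical counter."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_timestamp(self):
        # Returns the current counter value; the counter then advances.
        return next(self._counter)

clock = TimestampGenerator()
ts_t1 = clock.next_timestamp()   # T1 starts first
ts_t2 = clock.next_timestamp()   # T2 starts later
t1_is_older = ts_t1 < ts_t2
print(ts_t1, ts_t2, t1_is_older)
```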

Optimistic
The optimistic approach is based on the assumption that conflicts are rare
and that it is more efficient to let transactions proceed without imposing
delays to ensure serializability. Conflicts are checked for only when a
transaction is about to commit.
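
One common way to realize the optimistic idea is to keep a version number with each data item and validate it at write time. The text does not prescribe this mechanism, so the sketch below, including the names VersionedItem and optimistic_update, is an assumption for illustration.

```python
class VersionedItem:
    """A data item with a version number that changes on every write."""

    def __init__(self, value):
        self.value = value
        self.version = 0

def optimistic_update(item, read_version, new_value):
    """Write only if nobody changed the item since it was read."""
    if item.version != read_version:
        return False          # validation failed; the caller must retry
    item.value = new_value
    item.version += 1
    return True

account = VersionedItem(100)
v_t1 = account.version        # T1 reads without taking a lock
v_t2 = account.version        # T2 reads the same version
t1_ok = optimistic_update(account, v_t1, 70)   # T1 validates and writes
t2_ok = optimistic_update(account, v_t2, 50)   # T2's version is stale
print(t1_ok, t2_ok, account.value)
```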

Recovery Techniques in DBMS


Database systems, like any other computer system, are subject to failures,
but the data stored in them must be available as and when required. When
a database fails, it must possess the facilities for fast recovery. It must also
guarantee atomicity, i.e. either a transaction completes successfully and is
committed (its effect is recorded permanently in the database) or it
has no effect on the database at all. There are both automatic and
non-automatic ways of backing up data and of recovering from failure
situations. Database recovery techniques are the techniques used to recover
data lost due to system crashes, transaction errors, viruses, catastrophic
failure, incorrect command execution, etc. To prevent data loss, recovery
techniques based on deferred update and immediate update, or on backing
up data, can be used. Recovery techniques depend heavily on the existence
of a special file known as the system log. It contains information about the
start and end of each transaction and any updates which occur during the
transaction. The log keeps track of all transaction operations that affect the
values of database items. This information is needed to recover from
transaction failure.
The log is kept on disk and contains the following types of entries:
• start_transaction(T): This log entry records that transaction T starts
execution.
• read_item(T, X): This log entry records that transaction T reads the value
of database item X.
• write_item(T, X, old_value, new_value): This log entry records that
transaction T changes the value of the database item X from old_value
to new_value. The old value is sometimes known as the before-image
of X, and the new value is known as the after-image of X.
• commit(T): This log entry records that transaction T has completed all
accesses to the database successfully and its effect can be committed
(recorded permanently) to the database.
• abort(T): This records that transaction T has been aborted.
• checkpoint: A checkpoint is a mechanism whereby all the previous log
entries are removed from the system and stored permanently on disk.
A checkpoint declares a point before which the DBMS was in a consistent
state and all the transactions were committed.
A transaction T reaches its commit point when all its operations that
access the database have been executed successfully, i.e. the transaction
has reached the point at which it will not abort (terminate without
completing). Once committed, the transaction is permanently recorded in
the database. Commitment always involves writing a commit entry to the
log and writing the log to disk. After a system crash, the log is
searched backwards for all transactions T that have written a
start_transaction(T) entry into the log but have not yet written a commit(T)
entry; these transactions may have to be rolled back to undo their
effect on the database during the recovery process.
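
The log search described above can be sketched as a single pass over the log. The tuple layout ("entry_type", transaction, ...) is an assumed in-memory representation of the log entries, not a real log format.

```python
def transactions_to_roll_back(log):
    """Transactions with a start_transaction entry but no commit entry."""
    started = []
    committed = set()
    for entry in log:
        kind, txn = entry[0], entry[1]
        if kind == "start_transaction":
            started.append(txn)
        elif kind == "commit":
            committed.add(txn)
    return [txn for txn in started if txn not in committed]

example_log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 10, 20),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 5, 7),
    ("commit", "T1"),
    # crash happens here: T2 never wrote its commit entry
]
pending = transactions_to_roll_back(example_log)
print(pending)
```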
• Undoing – If a transaction crashes, the recovery manager may undo it,
i.e. reverse its operations. This involves examining the log for the
transaction's write_item(T, x, old_value, new_value) entries and setting
the value of item x in the database back to old_value.
There are two major techniques for recovery from non-catastrophic
transaction failures: deferred updates and immediate updates.
• Deferred update – This technique does not physically update the
database on disk until a transaction has reached its commit point.
Before reaching commit, all transaction updates are recorded in the
local transaction workspace. If a transaction fails before reaching its
commit point, it will not have changed the database in any way, so
UNDO is not needed. It may be necessary to REDO the effect of the
operations that are recorded in the local transaction workspace,
because their effect may not yet have been written to the database.
Hence, deferred update is also known as the no-undo/redo
algorithm.
• Immediate update – In immediate update, the database may be
updated by some operations of a transaction before the transaction
reaches its commit point. However, these operations are recorded in a
log on disk before they are applied to the database, making recovery
still possible. If a transaction fails to reach its commit point, the effect of
its operations must be undone, i.e. the transaction must be rolled back;
hence both undo and redo are required. This technique is known
as the undo/redo algorithm.
• Caching/Buffering – One or more disk pages that include the data
items to be updated are cached into main memory buffers and then
updated in memory before being written back to disk. A collection of
in-memory buffers called the DBMS cache is kept under the control of
the DBMS for holding these pages. A directory is used to keep track of
which database items are in the buffers. A dirty bit is associated with
each buffer; it is 0 if the buffer has not been modified and 1 if it has.
• Shadow paging – It provides atomicity and durability. A directory with
n entries is constructed, where the ith entry points to the ith database
page on disk. When a transaction begins executing, the current
directory is copied into a shadow directory. When a page is to be
modified, a shadow page is allocated in which the changes are made, and
when it is ready to become durable, all pages that refer to the original
are updated to refer to the new replacement page.
• Backward Recovery – The terms “rollback” and “UNDO” also
refer to backward recovery. This technique is helpful when a backup of
the data is not available and previous modifications need to be undone.
With the backward recovery method, unwanted modifications are
removed and the database is returned to its prior condition. All
changes made during the affected transaction are reversed. In other
words, it undoes the erroneous database updates, after which valid
transactions can be reprocessed.
• Forward Recovery – The terms “roll forward” and “REDO” refer to forward
recovery. This technique is helpful when a database needs to be brought
up to date with all verified changes.
The recorded changes of valid transactions are reapplied to the database
to roll those modifications forward. In other words, the database is
restored from preserved data together with the valid transactions
recorded since it was last saved.
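
The undo and redo steps for the immediate update technique can be sketched over an in-memory log. The dictionary standing in for the database, the tuple layout of the log entries and the function name recover are all assumptions for illustration. Uncommitted writes are undone in reverse log order using their old values, then committed writes are redone in log order using their new values.

```python
def recover(database, log):
    """Undo/redo sketch: roll back uncommitted writes, replay committed ones."""
    committed = {entry[1] for entry in log if entry[0] == "commit"}

    # UNDO: walk the log backwards and restore before-images.
    for entry in reversed(log):
        if entry[0] == "write_item" and entry[1] not in committed:
            _, _, item, old_value, _ = entry
            database[item] = old_value

    # REDO: walk the log forwards and reapply after-images.
    for entry in log:
        if entry[0] == "write_item" and entry[1] in committed:
            _, _, item, _, new_value = entry
            database[item] = new_value
    return database

crash_log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 10, 20),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 5, 7),
    ("commit", "T1"),
    ("write_item", "T2", "Y", 7, 9),
]
# State on disk at the crash: T1's write had not reached disk, T2's had.
on_disk = {"X": 10, "Y": 9}
recovered = recover(on_disk, crash_log)
print(recovered)
```

After recovery, X holds the committed value 20 and Y is back at 5, its value before the uncommitted transaction T2 started.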
Some of the backup techniques are as follows:
• Full database backup – The full database, including the data and the
meta information needed to restore the whole database (including
full-text catalogs), is backed up at predefined intervals.
• Differential backup – It stores only the data changes that have
occurred since the last full database backup. When some data has
changed many times since the last full database backup, a differential
backup stores the most recent version of the changed data. To restore
from it, the full database backup must be restored first.
• Transaction log backup – All events that have occurred in the
database, such as a record of every single statement executed, are
backed up. It is the backup of the transaction log entries and contains all
transactions that have been applied to the database. Through this, the
database can be recovered to a specific point in time. It is even
possible to back up the transaction log when the data files are
destroyed, so that not even a single committed transaction is lost.

What is Query Optimization?


Query optimization is of great importance for the performance of a relational
database, especially for the execution of complex SQL statements. A query
optimizer decides the best methods for implementing each query.
The query optimizer selects, for instance, whether or not to use indexes for
a given query, and which join methods to use when joining multiple tables.
These decisions have a tremendous effect on SQL performance, and query
optimization is a key technology for every application, from operational
systems to data warehouses and analytical systems to content management
systems.
The principles of query optimization are as follows −
• Understand how your database is executing your query − The first
phase of query optimization is understanding what the database is
doing. Different databases have different commands for this. For
example, in MySQL, one can use “EXPLAIN [SQL Query]”
to see the query plan. In Oracle, one can use “EXPLAIN PLAN FOR
[SQL Query]” to see the query plan.
• Retrieve as little data as possible − The more information retrieved
by the query, the more resources the database has to expend
to process and store these records. For example, if you only need
to fetch one column from a table, do not use ‘SELECT *’.
• Store intermediate results − Sometimes the logic for a query can be quite
complex. It is possible to produce the desired outcome through the
use of subqueries, inline views, and UNION-type statements. With those
methods, the intermediate results are not saved in the database but are
used directly within the query. This can lead to performance issues,
particularly when the intermediate results have a huge number of rows.
The query optimization strategies are as follows −
• Use Index − Using an index is the first strategy one should
try to speed up a query.
• Aggregate Table − Tables can be pre-populated at higher levels of
aggregation so that less information has to be parsed.
• Vertical Partitioning − The table is partitioned by columns.
This method reduces the amount of information an SQL query has
to process.
• Horizontal Partitioning − The table is partitioned by data
value, most often time. This method reduces the amount of information
an SQL query has to process.
• De-normalization − The process of de-normalization combines
multiple tables into a single table. This speeds up query execution
because fewer table joins are required.
• Server Tuning − Each server has its own parameters, and tuning them
so that the server takes full advantage of the hardware resources
can significantly speed up query execution.
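
The effect of an index on the query plan can be observed with SQLite, whose counterpart to the EXPLAIN commands mentioned above is EXPLAIN QUERY PLAN. In the sketch below the table contents, the index name idx_company and the helper query_plan are illustrative, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT_MAST (PRODUCT TEXT, COMPANY TEXT, COST INTEGER)")
conn.executemany(
    "INSERT INTO PRODUCT_MAST VALUES (?, ?, ?)",
    [(f"Item{i}", f"Com{i % 3 + 1}", 10 * i) for i in range(1, 11)],
)

# Only the needed columns are selected, rather than SELECT *.
QUERY = "SELECT PRODUCT, COST FROM PRODUCT_MAST WHERE COMPANY = 'Com1'"

def query_plan(sql):
    """Return the plan steps reported by EXPLAIN QUERY PLAN as one string."""
    # The last column of each result row is the human-readable plan step.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = query_plan(QUERY)     # full table scan expected
conn.execute("CREATE INDEX idx_company ON PRODUCT_MAST (COMPANY)")
plan_after = query_plan(QUERY)      # index search expected

print(plan_before)
print(plan_after)
```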

Encryption in database:
While the study of encryption is about trying to explain encryption logic
systematically through generalizations and propositions, encryption technology is,
based on encryption theories, a product of necessity aimed at creating the most
cost-efficient, profitable outcomes in line with economic principles. It is the
outcome of a process of transforming, refining, and amalgamating theories for
practical application.

According to business requirements, the location for encryption implementation
and the characteristics of data can differ. This is why encryption technology needs
to be qualified by its business value. Delving into encryption technology, therefore,
requires a broad, overall understanding of systems and businesses. This is a recent
example of a mobile messenger application encryption system that was designed
and implemented. At the most basic level, data encryption was needed in the
DBMS (Database Management System), which saves conversations communicated
through a messenger.

However, in light of general trends in security breaches, the above configuration is
incapable of providing sufficient security. In reality, web application encryption
needed to be implemented as follows:

The configuration includes the encryption of user authentication, section
encryption, message encryption, file encryption and key management.

All kinds of security system configurations follow secure IT system design
principles. Therefore, to deploy encryption technology properly, all layers
and areas of a system need to be considered.

Encryption should be appropriately implemented in all three layers of the IT
system, namely the application, system, and network layers. With secure
key management, privilege management, and access control, solid
encryption is achievable.

Many companies or organizations are hesitant to implement encryption
because of the belief that utilizing encryption technology slows down the
system’s performance. Even for companies that do implement encryption,
they believe that a performance downgrade is simply a necessary trade-off
– a price that must be paid in order for their data to be secure.

However, encryption specialists should be able to help implement a system
that provides security at the same level for various environments and
minimizes degradation of system performance. In most cases, the
degradation isn’t caused by issues of technology – it is caused by
insufficient understanding of the system and poorly designed applications.

In general, most security breaches occur due to poor security
administration, because unrealistic expectations of a security tool’s
capabilities leave a user off guard. The only way to prevent such incidents
is to improve security consciousness across the user group. Only when people
gain a sufficient understanding of security and receive training in security
governance can proper security administration take place.

Many security vendors may tout the following:

“If you purchase this encryption appliance, all your security concerns
will be resolved.”
But this is undoubtedly the wrong attitude: a specialized encryption vendor
will always first talk about the need for security governance and
for improving security consciousness across an organization.

Akin to culture, encryption impacts all aspects of a business, including
design, development, and operations. Alongside the advancement of the
knowledge and information society, data protection is growing in
importance. Among various measures for protecting data, from the
technical reliability standpoint, the most important and fundamental
method is through encryption.

Kerberos
Kerberos provides a centralized authentication server whose function is
to authenticate users to servers and servers to users. In Kerberos, an
authentication server and a database are used for client authentication.
Kerberos runs as a trusted third-party server known as the Key
Distribution Center (KDC). Each user and service on the network is a
principal.
The main components of Kerberos are:

• Authentication Server (AS):
The Authentication Server performs the initial authentication and issues
the ticket for the Ticket Granting Service.

• Database:
The Authentication Server verifies the access rights of users in the
database.

• Ticket Granting Server (TGS):
The Ticket Granting Server issues the ticket for the Server.
Kerberos Overview:

• Step-1:
The user logs in and requests services on the host. The user thus requests
the ticket-granting service.

• Step-2:
The Authentication Server verifies the user’s access rights using the
database and then returns a ticket-granting ticket and a session key. The
results are encrypted using the password of the user.

• Step-3:
The user decrypts the message using the password and then sends
the ticket to the Ticket Granting Server. The ticket contains authenticators
such as the user name and network address.

• Step-4:
The Ticket Granting Server decrypts the ticket sent by the user, the
authenticator verifies the request, and the Ticket Granting Server then
creates the ticket for requesting services from the Server.

• Step-5:
The user sends the ticket and authenticator to the Server.

• Step-6:
The Server verifies the ticket and authenticators and then grants access
to the service. After this, the user can access the services.
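
The six steps can be traced with a toy model. This is not real Kerberos and uses no real cryptography: the Sealed class only stands in for symmetric encryption, and the user name, keys and function names are hypothetical. Timestamps, network addresses and the separate client-to-server session key of the real protocol are left out.

```python
class Sealed:
    """Stand-in for symmetric encryption: opens only with the sealing key."""

    def __init__(self, key, payload):
        self._key, self._payload = key, payload

    def open(self, key):
        if key != self._key:
            raise PermissionError("wrong key")
        return self._payload

USER_KEYS = {"alice": "alice-password-key"}   # the AS database (hypothetical user)
TGS_KEY = "tgs-secret"                        # shared by the AS and the TGS
SERVICE_KEY = "service-secret"                # shared by the TGS and the server

def authentication_server(user):
    # Steps 1-2: issue a session key and a ticket-granting ticket (TGT),
    # sealed with the key derived from the user's password.
    session_key = f"session-{user}"
    tgt = Sealed(TGS_KEY, {"user": user, "session_key": session_key})
    return Sealed(USER_KEYS[user], {"session_key": session_key, "tgt": tgt})

def ticket_granting_server(tgt, authenticator):
    # Step 4: open the TGT, check the authenticator, issue a service ticket.
    ticket = tgt.open(TGS_KEY)
    if authenticator.open(ticket["session_key"]) != ticket["user"]:
        raise PermissionError("authenticator mismatch")
    return Sealed(SERVICE_KEY, {"user": ticket["user"]})

def server(service_ticket):
    # Step 6: the server opens the ticket with its own key and grants access.
    return f"access granted to {service_ticket.open(SERVICE_KEY)['user']}"

# Step 3: the user opens the AS reply with the password-derived key.
reply = authentication_server("alice").open("alice-password-key")
authenticator = Sealed(reply["session_key"], "alice")
service_ticket = ticket_granting_server(reply["tgt"], authenticator)
result = server(service_ticket)   # Steps 5-6
print(result)
```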

Kerberos Limitations

• Each network service must be modified individually for use with
Kerberos
• It doesn’t work well in a timeshare environment
• Requires a secured Kerberos server
• Requires an always-on Kerberos server
• All passwords are stored encrypted with a single key
• Assumes workstations are secure
• May result in cascading loss of trust
• Scalability

Is Kerberos Infallible?

No security measure is 100% impregnable, and Kerberos is no exception.
Because it has been around for so long, hackers have had the opportunity over
the years to find ways around it, typically through forging tickets, repeated
attempts at password guessing (brute force/credential stuffing), and the
use of malware to downgrade the encryption.

Despite this, Kerberos remains the best access security protocol available
today. The protocol is flexible enough to employ stronger encryption
algorithms to combat new threats, and if users employ good password-
choice guidelines, you shouldn’t have a problem!

What is Kerberos Used For?

Kerberos can be found everywhere in the digital world, and it is most
commonly used in secure systems that rely on robust authentication and
auditing capabilities. Kerberos is used for POSIX, Active Directory, NFS,
and Samba authentication. It is also an alternative authentication system
for SSH, POP, and SMTP.

Data Integrity

Data integrity refers to the overall accuracy, completeness, and reliability of data.
It can be characterized by the lack of unintended variation between two instances
or consecutive updates of a record, indicating that the information is error-free.
It also relates to the security and integrity controls and methods used for
regulatory compliance.

Data integrity in a database is preserved by an array of error-checking and
validation procedures, rules, and principles applied during the design phase of
the integration flow. These checks and correction procedures are based on
predefined business rules. For instance, a rule may dictate that records with an
incorrect date or time value are filtered out.
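
As a minimal sketch of such a rule, the following Python snippet filters out
records whose date value is invalid. The field name hire_date, the
YYYY-MM-DD format, and the sample records are assumptions made for
illustration only.

```python
from datetime import datetime


def has_valid_date(record, field="hire_date", fmt="%Y-%m-%d"):
    """Return True only if the field exists and holds a real calendar date."""
    try:
        datetime.strptime(record[field], fmt)
        return True
    except (KeyError, TypeError, ValueError):
        return False


records = [
    {"id": 1, "hire_date": "2023-04-15"},
    {"id": 2, "hire_date": "2023-02-30"},  # 30 February does not exist
    {"id": 3, "hire_date": "15/04/2023"},  # wrong format
]

# Only records that pass the rule continue to the next stage.
clean = [r for r in records if has_valid_date(r)]
```

Here only the record with id 1 passes the check.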

The question then arises, why is it imperative to maintain data integrity in
a database?

The importance of maintaining data integrity in a database is evident when creating
relationships between disparate data elements. It ensures that the data transferred
from one stage to another is accurate and error-free.

Data Integrity vs. Data Quality vs. Data Security

People often confuse data integrity with data security or data quality. However,
these three are related but different concepts.

Data security concerns measures taken to protect enterprise data from misuse. It
includes using methods and techniques that make your data inaccessible to
undesired parties or making selected data accessible to the desired parties. Data
security breaches can threaten the existence of an organization. On the other hand,
data integrity deals with the accuracy and completeness of data present in the
database.

The end goal of data security is to protect your data from external or internal
breaches. Thus, it is one of the many aspects of data integrity, but it isn’t extensive
enough to account for the numerous procedures essential for keeping your
information unaffected over time. Similarly, data quality is another facet of data
integrity, albeit a major one.

Data quality ensures that the data stored in your database is compliant with the
organization’s standards and requirements. In other words, it helps maintain
integrity in a database. In doing so, it applies a set of rules to a specific or
complete dataset before the data is stored in the target database. Data quality
also includes data accuracy, which refers specifically to the correctness of stored
values. The difference between data integrity and data accuracy can be understood
by seeing data integrity as an umbrella term, of which data accuracy is one of
many categories.

Integrity of Data In a Database Table

Data integrity in a database covers all aspects of data quality and goes further
by enforcing rules and procedures that govern how information is entered,
stored, transmitted, and more.

Consider this example of data integrity. While the Salary of all employees is an
integer, one employee has a salary in alphanumeric characters. Since the Salary
column only accepts integers (INT), the value 697abc will not be accepted by the
database. This is one way in which data is protected by the database, using
domain-level data integrity.
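
The Salary example can be tried with Python's built-in sqlite3 module. This is
a sketch under assumptions: the table and column names are hypothetical, and
because SQLite is lenient about column types by default, the integer-only rule
is stated explicitly with a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        name   TEXT NOT NULL,
        -- the domain of salary is restricted to integer values
        salary INTEGER NOT NULL CHECK (typeof(salary) = 'integer')
    )
    """
)


def try_insert(name, salary):
    """Return True if the row was stored, False if the database refused it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (name, salary))
        return True
    except sqlite3.IntegrityError:
        return False


good_row_stored = try_insert("Asha", 52000)
bad_row_stored = try_insert("Chen", "697abc")  # violates the salary domain
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The integer salary is stored, while the value 697abc is refused by the
database itself rather than by application code.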

Let’s look at the two methods that help ensure data integrity.

Types of Data Integrity

Data integrity, which applies to all databases, can be categorized into two main types:

Physical Integrity

Protecting data against external factors, such as natural calamities, power outages,
or hackers, falls under the domain of physical integrity. Moreover, human faults,
storage attrition, and several other problems can make data operators unable to
obtain information from a database.

Logical Integrity

It concerns whether the data within a relational database makes sense and stays
valid. Logical integrity constraints can be categorized into four types:

Entity Integrity

It depends on the creation of primary keys, that is, unique values that identify
data items. The purpose is to ensure that data is not recorded multiple times
(i.e., each data item is unique) and that no primary key field is null.

Entity integrity is a critical feature of a relational database that stores data in a
tabular format, which can be interconnected and used in various ways.
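
A short sqlite3 sketch of entity integrity follows. The student table and its
columns are hypothetical. NOT NULL is written out next to PRIMARY KEY because
SQLite, unlike most database systems, otherwise tolerates NULL in a
non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S-001', 'Asha')")


def violates_entity_integrity(roll_no, name):
    """Return True if the database rejects the row because of its key."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (roll_no, name))
        return False
    except sqlite3.IntegrityError:
        return True


duplicate_key = violates_entity_integrity("S-001", "Someone Else")
null_key = violates_entity_integrity(None, "No Roll Number")
new_key = violates_entity_integrity("S-002", "Bilal")
```

The duplicate and the null key are both rejected, and the row with a new,
non-null key is accepted.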

Referential Integrity

It denotes a series of procedures that ensure proper and consistent data storage and
usage. Referential integrity ensures that only valid alterations, additions, or
removals happen, by means of rules embedded in the database’s structure about how
foreign keys are used.

These rules might include conditions that remove duplicate data records, guarantee
that data is accurate, and prohibit the recording of unsuitable data.
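
The foreign key rules can be sketched with sqlite3 as follows. The department
and employee tables are hypothetical, and the PRAGMA line is needed because
SQLite enforces foreign keys only when that setting is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        dept_id INTEGER NOT NULL REFERENCES department(dept_id)
    )
    """
)
conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (1, 10)")


def blocked(sql):
    """Return True if the database refuses the statement."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# An employee cannot point at a department that does not exist.
orphan_insert_blocked = blocked("INSERT INTO employee VALUES (2, 99)")
# A department cannot be deleted while an employee still references it.
parent_delete_blocked = blocked("DELETE FROM department WHERE dept_id = 10")
```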

Domain Integrity

It’s a set of procedures that ensures the accuracy of every data item within a
domain. Here, a domain is defined as the set of acceptable values that a column is
permitted to contain.

Domain integrity encompasses rules and other processes restricting the format,
type, and volume of data recorded in a database. It ensures that every column in a
relational database is in a defined domain.

User-Defined Integrity

It comprises the rules defined by the operator to fulfill their specific requirements.
Entity, referential, and domain integrity are not always enough to refine and secure
data. Time and again, particular business rules must be considered and integrated
into data integrity processes to meet enterprise standards.
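
As an illustration, a business rule can be attached to a table as a CHECK
constraint. The rule used here (a bonus may not exceed half of the salary) and
the payroll table are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE payroll (
        emp_id INTEGER PRIMARY KEY,
        salary INTEGER NOT NULL,
        bonus  INTEGER NOT NULL,
        CHECK (bonus * 2 <= salary)  -- hypothetical business rule
    )
    """
)


def accepted(emp_id, salary, bonus):
    """Return True if the row satisfies the user-defined rule."""
    try:
        conn.execute("INSERT INTO payroll VALUES (?, ?, ?)", (emp_id, salary, bonus))
        return True
    except sqlite3.IntegrityError:
        return False


within_rule = accepted(1, 50000, 20000)   # bonus is 40% of salary
breaks_rule = accepted(2, 50000, 30000)   # bonus is 60% of salary
```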

Importance of Integrity in Data

Data integrity in a database is essential because it is a necessary constituent of data
integration. If data integrity is maintained, data values stored within the database
are consistent with the data model and type. Thus, reliable insights can be
gained from the data model so users can make informed business decisions.

Here are some examples of data integrity at risk:

 An attempt to enter a phone number in the wrong format.
 A developer accidentally tries to insert the data into the wrong table while
transferring data between two databases.
 An attempt to delete a record in a table, but another table references that
record as part of a relationship.
 A user accidentally tries to enter a phone number into a date field.

Distributed Database System


A distributed database is a database that is not limited to one system; it
is spread over different sites, i.e., on multiple computers or over a network of
computers. A distributed database system is located on various sites that don’t
share physical components. This may be required when a particular database
needs to be accessed by various users globally. It needs to be managed such that
to the users it looks like one single database.

Types:

1. Homogeneous Database:
In a homogeneous database, all sites store the database identically. The
operating system, database management system, and the data structures used
are the same at all sites. Hence, they’re easy to manage.

2. Heterogeneous Database:
In a heterogeneous distributed database, different sites can use different schemas
and software, which can lead to problems in query processing and transactions.
Also, a particular site might be completely unaware of the other sites. Different
computers may use different operating systems and different database applications.
They may even use different data models for the database. Hence, translations
are required for different sites to communicate.

Distributed Data Storage:

There are 2 ways in which data can be stored on different sites. These are:

1. Replication –
In this approach, the entire relation is stored redundantly at 2 or more sites. If
the entire database is available at all sites, it is a fully redundant database. Hence,
in replication, systems maintain copies of data.

This is advantageous as it increases the availability of data at different sites. Also,
query requests can now be processed in parallel.
However, it has certain disadvantages as well. Data needs to be constantly
updated. Any change made at one site needs to be recorded at every site where that
relation is stored, or else it may lead to inconsistency. This is a lot of overhead.
Also, concurrency control becomes much more complex, as concurrent access now
needs to be checked over a number of sites.
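
The trade-off can be seen in a small in-memory Python model. The class
ReplicatedRelation, the site names, and the account data are hypothetical.
Real systems also have to deal with network failures and concurrent writers,
which this sketch ignores.

```python
class ReplicatedRelation:
    """Keeps a full copy of the relation at every site."""

    def __init__(self, site_names):
        self.sites = {name: {} for name in site_names}

    def write(self, key, row):
        # Every replica must be updated, which is the overhead of replication.
        for copy in self.sites.values():
            copy[key] = dict(row)

    def read(self, site, key):
        # Any site can answer a read, which improves availability.
        return self.sites[site][key]

    def is_consistent(self, key):
        rows = [copy.get(key) for copy in self.sites.values()]
        return all(row == rows[0] for row in rows)


accounts = ReplicatedRelation(["site_a", "site_b", "site_c"])
accounts.write(101, {"owner": "Sara", "balance": 500})
consistent_after_write = accounts.is_consistent(101)

# Simulate an update that reaches only one site.
accounts.sites["site_a"][101]["balance"] = 450
consistent_after_missed_update = accounts.is_consistent(101)
```

After the update that reaches only site_a, a read at site_b still returns the
old balance, which is exactly the inconsistency described above.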

2. Fragmentation –
In this approach, the relations are fragmented (i.e., they’re divided into smaller
parts) and each of the fragments is stored at the sites where it is
required. It must be made sure that the fragments are such that they can be used
to reconstruct the original relation (i.e., there isn’t any loss of data).
Fragmentation is advantageous because it doesn’t create copies of data, so
consistency is not a problem.

Fragmentation of relations can be done in two ways:

Horizontal fragmentation – Splitting by rows –
The relation is fragmented into groups of tuples so that each tuple is assigned to
at least one fragment.

Vertical fragmentation – Splitting by columns –
The schema of the relation is divided into smaller schemas. Each fragment must
contain a common candidate key so as to ensure a lossless join.

In certain cases, a hybrid approach that combines fragmentation and replication
is used.
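
Both kinds of fragmentation, and the reconstruction of the original relation,
can be sketched with plain Python lists of dictionaries. The employees
relation and its attributes are made up for this example, with emp_id playing
the role of the common candidate key.

```python
employees = [
    {"emp_id": 1, "name": "Asha", "city": "Lahore", "salary": 52000},
    {"emp_id": 2, "name": "Bilal", "city": "Karachi", "salary": 61000},
    {"emp_id": 3, "name": "Chen", "city": "Lahore", "salary": 48000},
]

# Horizontal fragmentation: split by rows using a predicate on city.
lahore_rows = [r for r in employees if r["city"] == "Lahore"]
other_rows = [r for r in employees if r["city"] != "Lahore"]
# Reconstruction is the union of the fragments.
rebuilt_from_rows = sorted(lahore_rows + other_rows, key=lambda r: r["emp_id"])

# Vertical fragmentation: split by columns, keeping emp_id in every fragment.
personal = [
    {"emp_id": r["emp_id"], "name": r["name"], "city": r["city"]}
    for r in employees
]
pay = [{"emp_id": r["emp_id"], "salary": r["salary"]} for r in employees]
# Reconstruction is a join on the shared key.
pay_by_id = {r["emp_id"]: r for r in pay}
rebuilt_from_columns = [{**p, **pay_by_id[p["emp_id"]]} for p in personal]
```

In both cases the rebuilt relation equals the original one, so no data is
lost.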

Applications of Distributed Database:

It is used in corporate management information systems.
It is used in multimedia applications.
It is used in military control systems, hotel chains, etc.
It is also used in manufacturing control systems.

Client-server Database Architecture in DBMS

In client-server architecture, many clients are connected to one server. The server
is centralized; it provides services to all clients. The clients send requests to
the server for different services, and the server returns results according to each
client’s request.

Client/server architecture in DBMS:


Client/server architecture is a computing model in which the server (a host
computer) holds, delivers, and manages most of the resources and work required by
the client. This type of architecture has one or more client computers attached
to a central server over a network. The system shares different resources.

Client/server architecture is also called a networking computing model or
client-server network because all the requests and demands are sent over a
network.
Working of Client-server Database Architecture in DBMS
The client-server model defines how the server provides services to clients.
The server is a centralized computer that provides services to all attached
clients, for example a file server or a web server. The basic work of the server
is to provide services to each client. The client can be a laptop computer, a
tablet, a smartphone, etc. The server can have several types of relationship with
clients. Many servers have a one-to-many relationship with clients, in which many
clients are connected to one server. When a client wants to communicate with the
server, the server may accept or reject the client’s request. When the server
accepts the request, it maintains a connection according to a defined protocol.
The protocol defines the rules that must be followed for any network connection.

If a client wants to send an email over the network, it sends a request to the
server using SMTP (Simple Mail Transfer Protocol), the protocol used to transfer
mail over the network. SMTP is a set of commands that handle authentication and
the transfer of email. When configuring the settings for your e-mail program, you
usually need to set the SMTP server to your local Internet Service Provider’s
SMTP settings. After all processing, the server transfers the e-mail to the
desired client.
Another example of a client-server model is online gaming.
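
The one-to-many request and reply pattern described above can be sketched with
Python's standard socketserver module. The one-line text protocol (the client
sends a line and the server answers with the same line in upper case) is
invented for this example and merely stands in for a real protocol such as
SMTP. It assumes the program is allowed to open a TCP port on the local
machine.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Server side: read one request line and send back one reply line."""

    def handle(self):
        request = self.rfile.readline().strip()
        self.wfile.write(request.upper() + b"\n")


def ask_server(address, message):
    """Client side: connect, send a request, and wait for the reply."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message.encode() + b"\n")
        return conn.makefile("rb").readline().decode().strip()


# One central server; port 0 lets the operating system pick a free port.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), UppercaseHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Several clients use the same server.
replies = [
    ask_server(server.server_address, f"hello from client {i}") for i in (1, 2, 3)
]

server.shutdown()
server.server_close()
```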
Structure of Client-server Database Architecture in DBMS
Using this architecture, the software is divided into three different tiers:
Presentation tier
Logic tier
Data tier
